Asp Forum - How to Code - comp.programming

zondervanz

3/19/2015 2:12:00 PM

Perhaps of interest to the community....

How to Code, v.20150317 DRAFT
@Xer0Dynamite

Note: the original document has been moved to <http://hackerspaces.org/How_t....

== Audience ==

Programmers and coding can be divided into four camps:

1. Participating in the Programmer/Data Ecosystem (GPL and OOPv2),
2. Scientific Computing (Performance-based),
3. Business Use (Persistence and Security of Data),
4. Fun and learning (toy problems and *user*-specific domains)

This guide is mainly for those in camp #1. The rest of you, in order of precedence, will have to wait until you gravitate to the top camp and figure out that that's where you wanted to be.

== Foundations ==

I'm going to tell you, the way to mastery is long. If you just want to be a mediocre programmer, go get your certification from Microsoft, display it on your resume for everyone, and be done with it.

Otherwise, plan on growing your neckbeard (or female equivalent) and let's go.

=== Challengers ===

....On the way to becoming a master, there are two main challengers alongside you in the programmer's chair. These are like Death, always ready to take you.

CHALLENGER #1: RushingToExecutable. You'll be inclined to get a program to work as soon as possible. Everybody likes to see immediate results, but down the road, code written for this short-term savings won't be extensible--if it's even comprehensible. Most code should get re-used or appreciated, otherwise why write it? We're in the Internet Age here. I'll appreciate your slim code for your ultra domain-specific application later -- after you re-write it modularly so I can use it for *my* applications.

The best counter-weapon for this battle is TestDrivenDevelopment.

CHALLENGER #2: Cruftiness. You wrote some code a while back. You're not sure how it works anymore, so you don't want to change it, right? You pussy-foot around it like a black-box. That's called cruft. I'm assuming _brevity_ is a shared value in this guide. You're a programmer, not a shill working for LOC, right? So, bite the bullet.

The best weapon for this battle is RefactorMercilessly which you can draw out of your bag of ExtremeProgramming.

Got it? You don't want to enter battle unprepared and without knowing the challengers.
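
For Challenger #1, here is a minimal sketch of what test-first work can look like in Python. The function name slugify and its behaviour are hypothetical, invented only for illustration. The two test functions are meant to be written first and run while slugify is still missing, so they fail; the function body is then filled in until they pass.

```python
def slugify(title):
    """Turn a title into a lowercase, hyphen-separated slug (hypothetical example)."""
    words = title.lower().split()   # split() with no argument also collapses repeated spaces
    return "-".join(words)


# These tests describe the behaviour we want. In a test-first workflow they
# exist before slugify() does, and they fail until it is written.
def test_spaces_become_hyphens():
    assert slugify("How to Code") == "how-to-code"


def test_extra_whitespace_is_collapsed():
    assert slugify("  How   to  Code ") == "how-to-code"


test_spaces_become_hyphens()
test_extra_whitespace_is_collapsed()
```

The same tests later protect you against Challenger #2: with them in place, old code can be changed without tiptoeing around it.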

=== Allies ===

If you're going to have opponents, you should also have allies, right? The following are powerful allies to keep close to you. Just because they're called "allies" doesn't mean they're always friendly. They are primal forces for forging YOU, the apprentice who wants to master the machine. The value of the allies exists and can be used independently of my commentary which follows (which is just my personal journey with the allies). Treat them well, so they'll stay lustrous and not abandon you. Used properly, the allies will harness the challengers and you'll be honed into a MASTER....

Being somewhat independent, the allies are "crucibles" -- they exist in a complex contradiction that YOU as the programmer must resolve. They are koans. There is no set of rules or guidebook in this terrain because you'll be forging the path, otherwise you'd earn nothing. Meditate on them.

CRUCIBLE #1 (readability): ClotheYourData and LeaveItNaked.
CRUCIBLE #2 (re-useability): SeparabilityOfDomains and ZeroSuperfluousNames.
CRUCIBLE #3 (correctness): TightCoupling and Brevity.

=== Commentary ===

This section is for accumulating discussion and experiences for working with the three allies: readability, re-useability, and correctness.

Crucible #1

Modular and object-oriented programming is about "clothing your data": putting a name or conceptual unit around your otherwise naked code or data bundles. This maxim applies to variable names, object names, and the file names holding your source. Apart from proper formatting, this is *the* step that allows your code to be readable "from the inside".

However, when your problem isn't broken down well, it can lead to bulky, poorly-fitting structures that no one wants to re-use (and can often indicate that you're working on the wrong problem--see "folksonomy"), so it has a companion rule: LeaveItNaked (a.k.a. DontContainTheUnknown).

The first example of this tension comes when naming an index variable. What should you call it? Well, if you know nothing about it, LeaveItNaked would say to make it as generic as possible, like "i". But unless it's coming from the void, tie the variable name to where it originates (from the mouse: "click", from the keyboard: "keypress"). If you know it's going to be a number or a character, call it "n" or "c" unless you're writing for beginners (in which case, you can use "num" or "ch" ;^).

Keeping your variable name as *small* and as *meaningful* as possible (another tension to optimize!) will prevent annoying variable re-naming later. The LeaveItNaked rule will help you not to put too many clothes on your data.

So, when you have a meaningful collection of data, put it in a struct or other linguistic force to group data and give it a name. It now becomes a unit. When you have this AND methods or operators to go with it, you have an object: make it so. Else, without a meaningful category or uniting force for some data, a grouping will only confuse everyone else and prematurely constrain your data.
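
A small Python sketch of both halves of this crucible, under assumptions: Point, total and handle are hypothetical names made up for the example, not anything from a real library. Data that belongs together and has operations gets clothed as an object; data you know nothing about stays naked.

```python
from dataclasses import dataclass


@dataclass
class Point:
    """Clothed: x and y form a meaningful unit and have a method that goes with them."""
    x: float
    y: float

    def shifted(self, dx, dy):
        return Point(self.x + dx, self.y + dy)


def total(numbers):
    # Naked: all we know is that each value is a number, so the loop
    # variable gets the most generic short name available.
    result = 0
    for n in numbers:
        result += n
    return result


def handle(clicks):
    # These values originate from the mouse, so the name says so.
    return [click.shifted(1, 1) for click in clicks]
```

Wrapping the numbers passed to total() in a class of their own would be too many clothes: there is no uniting concept to name yet.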

Your programming environment, to make it enormously easier for you, has created keywords and variables: keywords are like machine instructions, variables are names to associate with data. To make things simple for everybody, you're not going to name any of your variables with a name from the list of keywords, ok? And, remember at some point, code is data, too.

These little programmatic things provided by your language and computing environment allow your concepts to be flattened out into individual working expressions or *sentences* and composed together in a way that is grammatically correct according to your programming language. A symbol (+,?,|,^, etc.), by the way, should be seen as an extra-short keyword.

Working together, the two directives for this maxim produce a virtuous and powerful tension with one another. End result: your code becomes beautiful.

Crucible #2:

Unless you're well along your path already, this one is probably the toughest one to crack, but once you do, your code will become a technical masterpiece. This rule is the "abstract upwards and winnow the chaff" rule, and is the opposite of the first crucible. But walking this path, you're separating and carving out domains. At the end of the path, you have re-usable, tight code with little to no redundancy (ZeroSuperfluousNames).

You want to maximize lexical separability of program functions so that they make the most meaningful units of modular code (i.e. you don't want everything lumped under one main function). BUT you also don't want a lot of excessive names polluting your namespace without good reason. What's the good reason? If there are more than two legitimate uses for your code (even if you don't use it yet), separate it into its own module/object/function with its own name. Done right, there should be little to no code duplication, NOR function harnesses or Objects that only get used once.
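
As a rough illustration of that threshold, here is a Python sketch in which one piece of logic has three legitimate uses and so earns its own name. All four function names are hypothetical, chosen only for the example.

```python
def clamp(value, low, high):
    """Limit value to the closed range [low, high]."""
    return max(low, min(value, high))


# Three separate callers need the same logic, so it lives in one named place
# instead of being written out three times.
def set_volume(level):
    return clamp(level, 0, 100)


def set_brightness(level):
    return clamp(level, 0, 255)


def set_opacity(fraction):
    return clamp(fraction, 0.0, 1.0)
```

Had only set_volume() needed it, writing the max/min expression inline would have kept one name out of the namespace.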

Apart from LISP machines, all computers use variables (names on data). Names for one purpose should be distinct from names of a different purpose (SeparabilityOfDomains). Constants, for example, shouldn't look like variables, nor should types look like variables. Syntax highlighting can't do this for you, so you have to perform this in the source text itself. Names act as an anchor. They are the gravitational attractors for other eyeballs which may be circulating around your code, so make them count. In some ways, languages are defined by how well they allow you to do this (having classes for objects, for example).

Besides careful name-choice, you have three techniques available for making names distinct and more meaningful: capitalization, parts of speech, and using special characters. These seem minor, compared to how well your language assists you in separating out different domains, but are included here for completeness.

1. Capitalization. Ideally, you'd use case to indicate "Parent", but this isn't practical in a real data ecosystem where objects are being examined and re-used, so use it in your source text to indicate something you intend to be passed around. That communicates something valuable to anyone who reads your code also -- it's like saying "ready for work". It's like the act of giving birth for your object (or module) to the outside world. And rather than ALLCAPS on a constant, use _underscores_ which also makes constants easier to find -- keeping them lowercase if they're for "internal use only".

2. Use verb-words for methods/functions, nouns for classes/variables (unless they're a pointer *to* a function, in which case call it a "pointerToX" or "pFunction"). This is a general rule, but for Objects that are designed to recede to the background and act as a go-between (like a network socket, for example), a verb might be more appropriate. Of course, all this is prior to satori. After enlightenment, you know that you only need << (in), >> (out), ? (query-name), % (clone) operators and that ObjectComposition will do the rest. The clone operation may even be provided by the interpreter environment.

3. If your language allows it, "?" and "!" characters at the end of function names can signify that they're a boolean query or will write-in-place, respectively.

These maxims will allow your code to be readable from the outside (the above view). Here endeth the lesson on separability.
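
A sketch of how the three techniques might look in Python, with caveats. Basket and _max_items are hypothetical names. Python does not allow "?" or "!" in identifiers, so an "is_" prefix is used here as the customary stand-in for a boolean query; that substitution is an assumption of this example, not something the guide prescribes.

```python
_max_items = 3   # lowercase with a leading underscore: a constant for internal use only


class Basket:    # capitalized noun: a type meant to be passed around and re-used
    def __init__(self):
        self.items = []              # noun: plain data

    def add(self, item):             # verb: a method that acts
        if len(self.items) >= _max_items:
            raise ValueError("basket is full")
        self.items.append(item)

    def is_empty(self):              # boolean query, where other languages might write "empty?"
        return not self.items
```

A reader skimming this from the outside can tell the constant, the type, the action and the query apart from the names alone, without syntax highlighting.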

Crucible #3:

Firstly, "tight-coupling" here strictly refers to [code <--> documentation <--> tests], not between different modules/units of your program. The first rule of documentation is ProperNaming, so be sure to follow the first two golden paths.

Let your ReadyForWork objects have comments or DocStrings underneath their definition header to form a self-encapsulated and documented, re-usable object.

By encapsulating your documentation with the code, you help ensure it stays up-to-date. Languages like Python define this into the language with DocStrings. Python DocTests go further and facilitate testing by allowing test code in your documentation. Read the Python DocTest module source by Tim Peters for some very good reasons why it's good.
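
A minimal sketch of the code <--> documentation <--> tests coupling using a Python docstring with doctest examples. The function initials is hypothetical and exists only to carry the docstring.

```python
def initials(full_name):
    """Return the upper-case initials of a name.

    The examples below are documentation and tests at the same time:

    >>> initials("Tim Peters")
    'TP'
    >>> initials("niklaus wirth")
    'NW'
    """
    return "".join(word[0].upper() for word in full_name.split())
```

Saved as a file (say, a hypothetical names.py), the examples can be checked with `python -m doctest -v names.py`. If the code changes and the docstring doesn't, the run reports the mismatch.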

This MAXIM will make a data ecosystem fun and complete. Good tests are instructive to everyone who uses your object. Good object/module documentation makes it easier and feels safer for others to re-use your code.

End result: You become a bona-fide steward of the programmer and data ecosystem and can take part in the management and value-generation of peta-bytes of knowledge amassed by humanity. Sweet!

== Coding ==

Let's get this straight from the get-go: you're on a von Neumann machine, not a Symbolics machine. That means computer programming consists of loading a simple programming language statement and any data that goes along with it, executing said statement (raising any errors or exceptions that occur upon such execution), continuing forward unless some condition says otherwise, and repeating this sequence. It's like a little Turing Machine, the basis for much of Computer Science. It may or may not finish. All that's to say: Let's not go on a joy ride upon the language-du-jour, 'k? We will, at some point, be worried about memory and CPU performance as all good programmers should; we're just saving it for last -- just as Master Jon Bentley taught us.
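
To make the load/execute/continue cycle concrete, here is a toy sketch in Python. It is an assumption-laden illustration, not a model of any real CPU: the machine has one register, and the two opcodes "add" and "jump_if_below" are invented for the example.

```python
def run(program, max_steps=1000):
    """Execute a list of (opcode, argument) pairs on a one-register toy machine."""
    acc = 0   # the single register
    pc = 0    # program counter: index of the next statement to load
    for _ in range(max_steps):
        if pc >= len(program):
            return acc                    # ran off the end: the program finished
        op, arg = program[pc]             # load the statement and its data
        pc += 1                           # by default, continue forward
        if op == "add":
            acc += arg
        elif op == "jump_if_below":       # the condition that says otherwise
            limit, target = arg
            if acc < limit:
                pc = target
        else:
            raise ValueError(f"unknown opcode: {op}")   # errors surface on execution
    raise RuntimeError("step limit reached; the program may never finish")
```

For example, run([("add", 2), ("jump_if_below", (6, 0))]) keeps looping back until the register reaches 6, while a program that always jumps back to itself hits the step limit, which is this sketch's crude answer to "it may or may not finish".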

So, the basic tools for forging your program are Divide, Conquer and Synthesize. That last one is new, and represents the evolution of the programming enterprise in the age of collaboration. You could say it started with the Object-Oriented Programming paradigm. The allies should be applied along the way and are referenced throughout this section.

Now, consider that the grand process of dividing, conquering, and synthesis has already started in several ways. One, you've _divided_ a chunk of human curiosity from the "master program", and have an idea of what you want to do. That's one point. Secondly, the history of computing has already _conquered_ the domain by separating I/O (generally named "stdin/stdout") from Processing in order to make a General Purpose Computer. This is a giant help and saves you buttloads of time wiring I/O junctions to your panels. Lastly, consider that you ARE the synthesis that's going to make it happen.

Beyond that, there can be several different "breakdowns" of how you go about structuring your program, but if you're constraining yourself to a particular programming language, choose the axis in which your language was designed (file-based vs. run-time based, parallel vs. serial, procedural vs. imperative dimensions, etc.). If you're not constraining yourself, choose Python/C and make it easy.

Note that your language will constrain you at the bottom (forcing syntax and semantically-valid expressions), while you constrain the program at the top (keeping your purpose for the program in mind). That leaves the middle with the most degrees of freedom. So the first thing to do is to start there, writing boilerplate. The general idea is that you're going to keep constraining yourself while making continual divisions, while the design of language keeps constraining you at the bottom forcing you to conquer your logic, with you honing and synthesizing your code until it works and becomes a work of art.

Let's begin.

1. Starting: So, you've already sliced off a big piece of the problem domain and narrowed it down to some task. You've started using CRUCIBLE #2 and now hold a piece of the programming domain for yourself. That is the first insight. Give it a name (CRUCIBLE #1), perhaps it's "Adventure". Open a new file, write some boilerplate code: giving author contact info, comment or docstring on the purpose of the program, a header for your main function (or something that indicates program START)(CRUCIBLE #3: TightCoupling), and save the file with your [functional] name. Eventually, you're going to divide downwards until you get to the level at which your programming language stops providing natural mechanisms to control the machine. BUT instead of making a single monolithic program, you want to train yourself for modularity and re-usability (CRUCIBLE #2: SeparabilityOfDomains), because you're almost certainly making constructs that, if generalized, could be useful to others, so hold off for a bit. But feel free to ensure that it compiles correctly: run it and let it fail to accomplish its task (just as TestDrivenDevelopment suggests to do).

2. Divide: Divide your remaining problem into multiple conceptual [sub]units (CRUCIBLE #2) and give them simple, but meaningful names (CRUCIBLE #1). If you don't have a meaningful division (maybe it's a "hello world" program), then you've found your first function ("main" if you're at the top of a C program). Each of your subunits is now like a mini-program and has its own I/O needs. Parameters act as inputs, and return statements act as outputs, with the code inside doing the processing. So, for example, if you're making a function to return the factorial of n, don't gather the input from the user inside the factorial function, which would make it less generalizable, but in your main loop. Similarly with output. Then you have a function to add to your library for re-use (i.e. CRUCIBLE #2: SeparabilityOfDomains. You now have a little domain (computing the factorial of a number) that you've conquered and can re-use). Divide downwards when you need to conquer closer to the machine, to get the greater detail needed in order to make a compilable program.

3. Conquer: Conceive the tightest procedural structure you need to accomplish the conceptual unit you're working on, how you're going to traverse that structure (while and for loops, for example, or simply calling each subunit in sequence from your outer [main] function), and what I/O you're going to have that will communicate and coordinate between the other conceptual units (including, finally, the top-level ones the system has provided: stdin/stdout). Ideally, you don't confuse conceptual layers, which is why global variables are shunned.

There are two main idioms to communicate between your different conceptual units: MessagePassing or VariableStorage. VariableStorage is preferred up to the point that you are constrained by your language mechanisms (like within a function definition). Otherwise, global variables are sometimes used to communicate between conceptual units (i.e. different functions). MessagePassing is generally how you send data in and out of your machine (collecting or sending characters/bytes in and out). A language like Assembly doesn't have lexical structure, so variable storage is about all you use within a program. Otherwise, MessagePassing is suggested as more modular and scalable.

By way of definition: Vertical data structures go towards greater meaning (like a Person being composed of name, department, eat function, etc.), horizontal data structures are more like collection data types, holding meaningful bits of even the slightest bit of vertical data (like a numerical quantity).

4. Synthesize: If and when the conceptual units you are working with are too convoluted, find a better abstraction, something that is more generalizable which will hold it all. Make a new module, a new Object. Synthesize to make SeparabilityOfDomains: generalizable, re-usable, loosely-coupled, modular code. Return to step #2.
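
The factorial example from step 2 can be sketched in Python roughly as follows. This is one possible layout under assumptions: the placeholder author line, the choice to pass the streams into main as parameters, and the error handling are all additions for illustration.

```python
"""Factorial: read a number from stdin and print its factorial.

Author: (your contact info here)
"""
import sys


def factorial(n):
    """Return n! for a non-negative integer n. No user I/O happens in here."""
    if n < 0:
        raise ValueError("n must be non-negative")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def main(stdin=sys.stdin, stdout=sys.stdout):
    # All input and output lives at the top level, so factorial() stays
    # a re-usable unit: parameter in, return value out.
    n = int(stdin.readline().strip())
    stdout.write(f"{factorial(n)}\n")
```

A script would call main() at the bottom of the file. Because the streams are parameters rather than globals, the units talk to each other by passing values, and factorial() can be lifted into a library unchanged.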

== Conclusions ==

The right outcome of the battle between You and RushingToExecutable is Maintainability (of your code) and personal Reserve.
The right outcome of the battle between You and Cruftiness is Parsimony (read: simplicity + harmony) and code Mastery.

Together, the end result is Elegance and you become the Victor! Now go get your Eye of the Tiger and roll!

== Attributions ==

Special thanks to the denizens of RefactorMercilessly, Bjarne Stroustrup for the concept of encapsulation, Niklaus Wirth for the concept of modular programming, and many other old-school programmers who forged the way. Also, WikiWords are documented at the wikiwikiweb: http://c2.c.... If beginners come your way, direct them to the OneTruePath.

3 Answers

Richard Heathfield

3/19/2015 2:43:00 PM

On 19/03/15 14:11, zondervanz@gmail.com wrote:
> Perhaps of interest to the community....

Mildly diverting. Having read it through, I have to say that I don't
agree with all of it. To save time, I'm just going to point out the
first thing I saw that I disagreed with:

> How to Code

--
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within

JJ

3/20/2015 2:22:00 PM

Wrong approach.
One should teach the concept of programming first before shoving them actual
programming.
Otherwise, they'll just memorize them without understanding them.

Lew Pitcher

3/20/2015 2:32:00 PM

On Friday March 20 2015 10:22, in comp.programming, "JJ"
<jj4public@vfemail.net> wrote:

> Wrong approach.

Agreed.

> One should teach the concept of programming first before shoving them
> actual programming.
> Otherwise, they'll just memorize them without understanding them.

If the person understands programming logic and concepts, they can apply
that knowledge to /any/ programming language and /any/ programming
requirement.

However, if they only understand "how to code in language X", they only know
how to code in language X.

--
Lew Pitcher
"In Skills, We Trust"
PGP public key available upon request
