Small
The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that. Functions should hardly ever be 20 lines long.
Blocks and indenting
This implies that the blocks within if statements, else statements, while statements and so on should be one line long. Probably that line should be a function call. This also implies that functions should not be large enough to hold nested structures. Therefore, the indent level of a function should not be greater than one or two.
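A minimal sketch of what this can look like. The InvoiceMailer and Invoice classes are hypothetical, invented only for illustration; the point is that each loop or if body is a single line that calls another function, so no function is indented more than two levels.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: all names here are invented for illustration.
class Invoice {
    private final String id;
    private final boolean overdue;

    Invoice(String id, boolean overdue) {
        this.id = id;
        this.overdue = overdue;
    }

    String id() {
        return id;
    }

    boolean isOverdue() {
        return overdue;
    }
}

class InvoiceMailer {
    private final List<String> sentMessages = new ArrayList<>();

    // The loop body is one line, and that line is a function call.
    void sendOverdueReminders(List<Invoice> invoices) {
        for (Invoice invoice : invoices)
            remindIfOverdue(invoice);
    }

    // The if body is one line as well.
    private void remindIfOverdue(Invoice invoice) {
        if (invoice.isOverdue())
            sendReminder(invoice);
    }

    // Stand-in for real delivery: records the message instead of mailing it.
    private void sendReminder(Invoice invoice) {
        sentMessages.add("Reminder for invoice " + invoice.id());
    }

    List<String> sentMessages() {
        return sentMessages;
    }
}
```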
Do one thing
If a function does only those steps that are one level below the stated name of the function, then the function is doing one thing. After all, the reason we write functions is to decompose a larger concept into a set of steps at the next level of abstraction.
Another way to know that a function is doing more than one thing is if you can extract another function from it with a name that is not merely a restatement of its implementation.
Sections within functions
Functions that do one thing cannot be reasonably divided into sections.
One level of abstraction per function
In order to make sure our functions are doing one thing, we need to make sure that the statements within our function are all at the same level of abstraction. Mixing levels of abstraction within a function is always confusing.
Reading code from top to bottom: the stepdown rule
We want the code to read like a top-down narrative. We want every function to be followed by those at the next level of abstraction so that we can read the program, descending one level of abstraction at a time as we read down the list of functions. I call this The Stepdown Rule.
Making the code read like a top-down set of TO paragraphs is an effective technique for keeping the abstraction level consistent.
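A small sketch of the Stepdown Rule, assuming a hypothetical ReportBuilder class (none of these names come from the text). Each function is followed by the functions it calls, and each comment is phrased as a TO paragraph.

```java
// Hypothetical example: ReportBuilder and its methods are invented for illustration.
class ReportBuilder {
    // TO build the report, we build the header, then the body, then the footer.
    String buildReport(String title, String[] rows) {
        return buildHeader(title) + buildBody(rows) + buildFooter(rows.length);
    }

    // TO build the header, we decorate the title.
    private String buildHeader(String title) {
        return "== " + title + " ==\n";
    }

    // TO build the body, we format each row in turn.
    private String buildBody(String[] rows) {
        StringBuilder body = new StringBuilder();
        for (String row : rows)
            body.append(formatRow(row));
        return body.toString();
    }

    // TO format a row, we prefix it with a bullet.
    private String formatRow(String row) {
        return "- " + row + "\n";
    }

    // TO build the footer, we report the row count.
    private String buildFooter(int rowCount) {
        return rowCount + " rows\n";
    }
}
```

Reading from top to bottom, each function sits one level of abstraction below the one that calls it.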
Switch Statements
It's hard to make a small switch statement. By their nature, switch statements always do N things. Unfortunately we can't always avoid switch statements, but we can make sure that each switch statement is buried in a low-level class and is never repeated. We do this with polymorphism.
The solution to this problem is to bury the switch statement in the basement of an Abstract Factory, and never let anyone see it. My general rule for switch statements is that they can be tolerated if they appear only once, are used to create polymorphic objects, and are hidden behind an inheritance relationship so that the rest of the system can't see them.
e.g.:
public abstract class Employee {
    public abstract boolean isPayDay();
    public abstract Money calculatePay();
    public abstract void deliverPay(Money pay);
}
public interface EmployeeFactory {
    public Employee makeEmployee(EmployeeRecord r) throws InvalidEmployeeType;
}
public class EmployeeFactoryImpl implements EmployeeFactory {
    public Employee makeEmployee(EmployeeRecord r) throws InvalidEmployeeType {
        switch (r.type) {
            case COMMISSIONED:
                return new CommissionedEmployee(r);
            case HOURLY:
                return new HourlyEmployee(r);
            case SALARIED:
                return new SalariedEmployee(r);
            default:
                throw new InvalidEmployeeType(r.type);
        }
    }
}
Use Descriptive Names
Don't be afraid to make a name long. A long descriptive name is better than a short enigmatic name. Use a naming convention that allows multiple words to be easily read in the function names, and then make use of those multiple words to give the function a name that says what it does.
Don't be afraid to spend time choosing a name. Indeed, you should try several different names and read the code with each in place.
Choosing descriptive names will clarify the design of the module in your mind and help you to improve it. It is not at all uncommon that hunting for a good name results in a favorable restructuring of the code.
Be consistent in your names. Use the same phrases, nouns, and verbs in the function names you choose for your modules. The same phraseology in those names allows the sequence to tell a story.
Function Arguments
The ideal number of arguments for a function is zero. Next comes one, followed closely by two. Three arguments should be avoided where possible. More than three requires very special justification and then shouldn't be used anyway.
Common Monadic Forms
There are two very common reasons to pass a single argument into a function. You may be asking a question about that argument, or you may be operating on that argument, transforming it into something else and returning it. A somewhat less common, but still very useful form for a single argument function is an event. In this form there is an input argument but no output argument.
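The monadic shapes can be sketched as follows: a question about the argument, a transformation of the argument, and an event. Titles and LoginMonitor are hypothetical classes invented for this illustration.

```java
import java.util.Locale;

// Hypothetical names, invented to illustrate the monadic shapes.
class Titles {
    // Question: asks something about the argument.
    static boolean isBlank(String title) {
        return title.trim().isEmpty();
    }

    // Transformation: turns the argument into something else and returns it.
    static String toSlug(String title) {
        return title.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", "-");
    }
}

class LoginMonitor {
    private int failedAttempts;

    // Event: input only, no return value; the call alters the state of the system.
    void passwordAttemptFailedNtimes(int attempts) {
        failedAttempts = attempts;
    }

    int failedAttempts() {
        return failedAttempts;
    }
}
```

The names make it plain which shape each function has: a reader should be able to tell a question from a transformation from an event without opening the body.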
Flag arguments
Flag arguments are ugly. Passing a boolean into a function is truly terrible practice: it loudly proclaims that the function does more than one thing. We should split the function into two functions.
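A minimal sketch of the split, assuming a hypothetical PageRenderer whose output is just a placeholder string. The flagged version is kept only to show the contrast.

```java
// Hypothetical example: PageRenderer and its return values are placeholders.
class PageRenderer {
    // Before: the caller has to remember what render(true) means.
    String render(boolean isSuite) {
        return isSuite ? "suite page" : "single test page";
    }

    // After: one function per behavior, no flag to decode at the call site.
    String renderForSuite() {
        return "suite page";
    }

    String renderForSingleTest() {
        return "single test page";
    }
}
```

A call such as renderer.renderForSuite() says what it does; renderer.render(true) does not.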
Dyadic functions
Dyads aren't evil, and you will certainly have to write them. However, you should be aware that they come at a cost and should take advantage of what mechanisms may be available to you to convert them into monads.
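One such mechanism is to move one of the two arguments into the constructor of a small class. The sketch below assumes a hypothetical FieldWriter that writes into a StringBuilder; a dyadic writeField(output, name) becomes a monadic write(name).

```java
// Hypothetical example: FieldWriter is invented for illustration.
class FieldWriter {
    private final StringBuilder output;

    // The destination is supplied once, at construction time.
    FieldWriter(StringBuilder output) {
        this.output = output;
    }

    // Monadic: only the field name remains as an argument.
    void write(String name) {
        output.append(name).append('\n');
    }
}
```

Usage: create the writer once with `new FieldWriter(output)`, then call `write("id")` and `write("name")` without repeating the destination.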
Triads
Functions that take three arguments are significantly harder to understand than dyads. The issues of ordering, pausing, and ignoring are more than doubled.
Argument objects
When a function seems to need more than two or three arguments, it is likely that some of those arguments ought to be wrapped into a class of their own.
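As a sketch, a triadic makeCircle(x, y, radius) can be reduced to a dyad by wrapping the coordinates in a Point. The Point, Circle, and Shapes classes below are illustrative assumptions, not part of any library.

```java
// Hypothetical example: these classes are invented for illustration.
class Point {
    final double x;
    final double y;

    Point(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

class Circle {
    final Point center;
    final double radius;

    Circle(Point center, double radius) {
        this.center = center;
        this.radius = radius;
    }
}

class Shapes {
    // Two arguments instead of three: x and y travel together as a Point.
    static Circle makeCircle(Point center, double radius) {
        return new Circle(center, radius);
    }
}
```

Grouping x and y is not cheating: the two values belong to one concept that deserves a name of its own.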
Argument list
Sometimes we want to pass a variable number of arguments into a function. If the variable arguments are all treated identically, they are equivalent to a single argument of type List, as in Object... args.
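A short sketch of a varargs function, using a hypothetical Labels.join helper. Because every element of parts is treated the same way, the function is effectively a dyad: a separator plus one list.

```java
// Hypothetical example: Labels.join is invented for illustration.
class Labels {
    // Effectively dyadic: 'parts' behaves like a single list argument.
    static String join(String separator, Object... parts) {
        StringBuilder joined = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0)
                joined.append(separator);
            joined.append(parts[i]);
        }
        return joined.toString();
    }
}
```

Calling `Labels.join(", ", "a", 1, 2.5)` and calling it with an explicit `Object[]` go through the same code path, which is what makes the list equivalence hold.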
Verbs and keywords
Choosing good names for a function can go a long way toward explaining the intent of the function and the order and intent of the arguments. In the case of a monad, the function and argument should form a very nice verb/noun pair, as in write(name).
With the keyword form, we encode the names of the arguments into the function name, as in assertExpectedEqualsActual(expected, actual). This strongly mitigates the problem of having to remember the ordering of the arguments.
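A sketch of the keyword form, assuming a hypothetical Assertions helper class. The function name itself states which argument comes first.

```java
import java.util.Objects;

// Hypothetical example: this helper is not the API of any test framework.
class Assertions {
    // The argument order is spelled out in the name: expected first, actual second.
    static void assertExpectedEqualsActual(Object expected, Object actual) {
        if (!Objects.equals(expected, actual))
            throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
    }
}
```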
Have no side effects
Side effects are devious and damaging mistruths that often result in strange temporal couplings and order dependencies. If you must have a temporal coupling, you should make it clear in the name of the function.
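The sketch below assumes a hypothetical UserValidator and Session. checkPassword only answers a question; the variant that also initializes the session says so in its name, so the coupling is at least visible to the caller (it still does two things, which is a separate problem).

```java
// Hypothetical example: names and the plain-text password are for illustration only.
class Session {
    private boolean initialized;

    void initialize() {
        initialized = true;
    }

    boolean isInitialized() {
        return initialized;
    }
}

class UserValidator {
    private final String storedPassword;
    private final Session session;

    UserValidator(String storedPassword, Session session) {
        this.storedPassword = storedPassword;
        this.session = session;
    }

    // No hidden side effect: the session is left untouched.
    boolean checkPassword(String candidate) {
        return storedPassword.equals(candidate);
    }

    // The side effect is announced in the name instead of being hidden.
    boolean checkPasswordAndInitializeSession(String candidate) {
        if (!checkPassword(candidate))
            return false;
        session.initialize();
        return true;
    }
}
```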
Output Arguments
Arguments are most naturally interpreted as inputs to a function. If you have been programming for more than a few years, I'm sure you've done a double-take on an argument that was actually an output rather than an input. In general, output arguments should be avoided. If your function must change the state of something, have it change the state of its own object.
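As a sketch, compare a call like appendFooter(report), where the reader cannot tell whether report is an input or an output, with report.appendFooter(). The Report class below is a hypothetical illustration.

```java
// Hypothetical example: Report is invented for illustration.
class Report {
    private final StringBuilder text = new StringBuilder();

    Report(String body) {
        text.append(body);
    }

    // The function changes the state of its own object; no output argument is needed.
    void appendFooter() {
        text.append("\n-- end of report --");
    }

    String text() {
        return text.toString();
    }
}
```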
Command Query Separation
Functions should either do something or answer something, but not both. Either your function should change the state of an object, or it should return some information about that object. Doing both often leads to confusion.
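A minimal sketch of the separation, assuming a hypothetical Attributes class backed by a map. Instead of one set(name, value) that both changes state and reports whether the attribute existed, the query and the command are separate functions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: Attributes is invented for illustration.
class Attributes {
    private final Map<String, String> values = new HashMap<>();

    Attributes(String... names) {
        for (String name : names)
            values.put(name, "");
    }

    // Query: answers something, changes nothing.
    boolean attributeExists(String name) {
        return values.containsKey(name);
    }

    // Command: changes state, returns nothing.
    void setAttribute(String name, String value) {
        values.put(name, value);
    }

    // Query: reads the current value.
    String attribute(String name) {
        return values.get(name);
    }
}
```

The call site then reads unambiguously: `if (attributes.attributeExists("username")) attributes.setAttribute("username", "unclebob");`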
Prefer exceptions to returning error codes
Returning error codes from command functions is a subtle violation of command query separation. It promotes commands being used as expressions in the predicates of if statements. This does not suffer from verb/adjective confusion but does lead to deeply nested structures. When you return an error code, you create the problem that the caller must deal with the error immediately. On the other hand, if you use exceptions instead of returned error codes, then the error processing code can be separated from the happy path code and can be simplified.
e.g.:
if (deletePage(page) == E_OK) {
    if (registry.deleteReference(page.name) == E_OK) {
        if (configKeys.deleteKey(page.name.makeKey()) == E_OK) {
            logger.log("page deleted");
        } else {
            logger.log("configKey not deleted");
        }
    } else {
        logger.log("deleteReference from registry failed");
    }
} else {
    logger.log("delete failed");
    return E_ERROR;
}
===>
try {
    deletePage(page);
    registry.deleteReference(page.name);
    configKeys.deleteKey(page.name.makeKey());
} catch (Exception e) {
    logger.log(e.getMessage());
}
Extract try/catch blocks
Try/catch blocks are ugly in their own right. They confuse the structure of the code and mix error processing with normal processing. So it is better to extract the bodies of the try and catch blocks out into functions of their own.
e.g.:
try {
    deletePage(page);
    registry.deleteReference(page.name);
    configKeys.deleteKey(page.name.makeKey());
} catch (Exception e) {
    logger.log(e.getMessage());
}
===>
public void delete(Page page) {
    try {
        deletePageAndAllReferences(page);
    } catch (Exception e) {
        logError(e);
    }
}
private void deletePageAndAllReferences(Page page) throws Exception {
    deletePage(page);
    registry.deleteReference(page.name);
    configKeys.deleteKey(page.name.makeKey());
}
private void logError(Exception e) {
    logger.log(e.getMessage());
}
Error handling is one thing
A function that handles errors should do nothing else. This implies that if the keyword try exists in a function, it should be the very first word in the function and that there should be nothing after the catch/finally blocks.
The Error.java dependency magnet
Returning error codes usually implies that there is some class or enum in which all the error codes are defined. Classes like this are a dependency magnet; many other classes must import and use them. Thus when the Error enum changes, all those other classes need to be recompiled and redeployed. This puts a negative pressure on the Error class. Programmers don't want to add new errors because then they have to rebuild and redeploy everything. So they reuse old error codes instead of adding new ones. When you use exceptions rather than error codes, then new exceptions are derivatives of the exception class. They can be added without forcing any recompilation or redeployment.
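A sketch of the exception-based alternative, with hypothetical names throughout. PageLockedException stands for an error added later: it derives from PageException, so a caller that catches the base type, like PageCleaner here, does not have to change.

```java
// Hypothetical example: every class here is invented for illustration.
class PageException extends Exception {
    PageException(String message) {
        super(message);
    }
}

class PageNotFoundException extends PageException {
    PageNotFoundException(String name) {
        super("page not found: " + name);
    }
}

// Added later as a new derivative; no shared Error enum had to be edited.
class PageLockedException extends PageException {
    PageLockedException(String name) {
        super("page is locked: " + name);
    }
}

class PageStore {
    // Toy rules standing in for real storage checks.
    void deletePage(String name) throws PageException {
        if (name.isEmpty())
            throw new PageNotFoundException(name);
        if (name.startsWith("locked-"))
            throw new PageLockedException(name);
    }
}

class PageCleaner {
    // Depends only on the base exception type.
    String tryDelete(PageStore store, String name) {
        try {
            store.deletePage(name);
            return "deleted";
        } catch (PageException e) {
            return e.getMessage();
        }
    }
}
```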
Don't repeat yourself
Duplication may be the root of all evil in software. Many principles and practices have been created for the purpose of controlling or eliminating it. Consider, for example, that all of Codd's database normal forms serve to eliminate duplication in data. Consider also how object-oriented programming serves to concentrate code into base classes that would otherwise be redundant. Structured programming, Aspect Oriented Programming, and Component Oriented Programming are all, in part, strategies for eliminating duplication. It would appear that since the invention of the subroutine, innovations in software development have been an ongoing attempt to eliminate duplication from our source code.
Structured Programming
Some programmers follow Edsger Dijkstra's rules of structured programming. Dijkstra said that every function, and every block within a function, should have one entry and one exit. Following these rules means that there should be one return statement in a function, no break or continue statements in a loop, and never, ever, any goto statements.
While we are sympathetic to the goals and disciplines of structured programming, those rules serve little benefit when functions are very small. It is only in large functions that such rules provide significant benefit.
So if you keep your functions small, then the occasional multiple return, break, or continue statement does no harm and can sometimes even be more expressive than the single-entry, single-exit rule. On the other hand, goto only makes sense in large functions, so it should be avoided.
Writing software is like any other kind of writing. When you write a paper or an article, you get your thoughts down first, then you massage it until it reads well. The first draft might be clumsy and disorganized, so you wordsmith it and restructure it and refine it until it reads the way you want it to read.
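A small illustration, using a hypothetical Search helper: in a function this short, an early return inside the loop is easy to follow and arguably clearer than threading a result variable through to a single exit.

```java
// Hypothetical example: Search is invented for illustration.
class Search {
    // Two return statements, yet the whole function is visible at a glance.
    static int indexOfFirstNegative(int[] values) {
        for (int i = 0; i < values.length; i++)
            if (values[i] < 0)
                return i;
        return -1;
    }
}
```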
Conclusion
Master programmers think of systems as stories to be told rather than programs to be written. They use the facilities of their chosen programming language to construct a much richer and more expressive language that can be used to tell that story. Part of that domain-specific language is the hierarchy of functions that describe all the actions that take place within that system. In an artful act of recursion, those actions are written to use the very domain-specific language they define to tell their own small part of the story.