Chapter 19

Managing Code in Large Projects


It can happen--if you practice the Art of Memory--that the symbols you put next to one another will modify themselves without your choosing it, and that when next you call them forth, they may say something new and revelatory to you, something you didn't know you knew.
--John Crowley, Little, Big

So far the examples given have been quite small and would therefore be most easily handled in a single .dm code file. However, for large projects, it may be advantageous to split your code up into several files. For example, the basic combat system might be defined in one file, monsters in another, magic scrolls in another, and so on. In some cases, different people may be working on various aspects of the project, making it convenient to split the code along those lines.

Another reason to use multiple files is to write code which can be re-used in multiple projects. Such files are often called library files. You may define such files yourself or you may use some that other people have created.

1. Including Files

The contents of one source file may be inserted into another by using the #include command. There are two forms depending on where you want the compiler to look for the file. In one case it only looks in the library directory and in the other it looks in the current directory first and then in the library directory if necessary.

#include <libfile>
#include "dmfile"
libfile is a library file.
dmfile is a source code file.

If the same file is included multiple times, only the first request is processed. This prevents trouble in cases where several files in a project all include the same library file.

All projects implicitly include the file <stddef.dm> at the top of the code. This file defines a few standard constants and makes some basic definitions.

Besides inserting source code files (ending in .dm), the #include command is also used to attach map files (ending in .dmm) to a project. The syntax is the same in either case. Map files are inserted into the main world map in successive z-levels and the x-y boundaries of the main map are automatically adjusted to fit the largest map which is included.
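As a rough sketch, the forms described above might appear together at the top of a main source file as follows. All of the file names are hypothetical, and the files would have to exist for the project to compile.

```dm
// Hypothetical file names, for illustration only.
#include <mylib.dm>      // searched for in the library directory only
#include "combat.dm"     // current directory first, then the library directory
#include "monsters.dm"
#include "town.dmm"      // a map file, added to the world map on its own z-level
#include "dungeon.dmm"   // a second map file, placed on the next z-level
```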


Figure 19.28: Pre Pre-processing

If you are using the Dream Maker interface to manage your project, you will very rarely have to write #include or #define FILE_DIR lines by hand. The reason is that Dream Maker automates these functions through the interface.

As mentioned earlier, code and map files are included by selecting the corresponding check-boxes in the file tree. The file referencing is handled with similar ease. All files in the project directory and below are automatically recognized without the need to supply directories or manually create FILE_DIR entries. For instance, if your project is in the directory world, you may access the file world/icons/mobs/warrior.dmi by simply supplying the name, warrior.dmi.

One time when you might need to use #include directly is to force a file to be processed before the code which follows. Section 19.3.1 describes the few situations in which the order of DM source code does matter.


2. The Preprocessor

The #include command is one of several preprocessor commands. They all begin with "#" and are placed on a line by themselves. (The DM preprocessor closely follows the one used in C and C++ compilers. The commands are often called preprocessor directives by C programmers.) The preprocessor handles such commands while the code file is initially being read and can be used to alter the appearance of the file as seen by the compiler. Additional preprocessor commands will be described in the following sections.

2.1 #define

The #define command creates a preprocessor macro. Any subsequent occurrences of the macro in the code will be replaced by the contents of the macro. Places where the macro occurs as part of another word or inside a text string do not count and will not be substituted.

#define Name Value
Name is the macro name.
Value is the text to substitute in its place.

This and all other preprocessor commands are terminated by the end of the line. Therefore, if you wish to extend it onto multiple lines, you must use \ to escape the end of the line.

The name of the macro may consist of upper and lowercase letters as well as digits and the underscore, as long as the first character is not a digit. By convention, macros are often named in all uppercase, but this is not a requirement.
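A minimal sketch of simple macros in use may help here. The names MAX_HEALTH, GREETING, and MAX_HEALTH_BONUS are hypothetical, chosen only for illustration. The comments show what the compiler should see after substitution.

```dm
// Hypothetical macro names, for illustration only.
#define MAX_HEALTH 100
#define GREETING \
   "Welcome, adventurer!"

mob/var/health = MAX_HEALTH        // becomes: mob/var/health = 100
mob/var/MAX_HEALTH_BONUS = 10      // untouched: the macro name is part of a longer word

mob/Login()
   ..()
   src << GREETING                        // becomes: src << "Welcome, adventurer!"
   src << "MAX_HEALTH is [MAX_HEALTH]"    // only the bracketed expression is replaced
```

The backslash at the end of the GREETING line continues the definition onto the next line, so the two lines are read as one command.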

It is also possible to have the macro take arguments which may then be substituted into the replacement text as desired.

#define Name(Arg1,Arg2,...) Value
Arg1 is the name of the first argument.
Arg2 is the name of the second argument, etc.

Wherever the argument names appear in the replacement text, they will be replaced by the values passed to the macro when it is used. Such a macro can be used like a procedure, but since it operates at the textual level, it is possible to do things which would not be possible with a procedure.
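The following sketch shows two macros that take arguments. SQUARE, HALFWAY, and MacroArgsDemo are hypothetical names, not part of DM. The comments show the text each use expands to.

```dm
// Hypothetical macros, for illustration only.
#define SQUARE(X) ((X)*(X))
#define HALFWAY(A,B) (((A)+(B))/2)

proc/MacroArgsDemo()
   world.log << SQUARE(3)       // expands to ((3)*(3))
   world.log << HALFWAY(2,8)    // expands to (((2)+(8))/2)
   world.log << SQUARE(1+2)     // expands to ((1+2)*(1+2))
```

Because the argument text is pasted in wherever the argument name appears, an argument used twice in the replacement text is also evaluated twice. That matters if the argument is a procedure call with side effects.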

Care should be taken when using macros in expressions. Since macro substitution simply inserts text from one place into another, there is no guarantee that expressions within the macro will be evaluated before being combined with an outer expression. To be safe, you can put parentheses around macro expressions to ensure they do not get combined in some unforeseen way with the external code.

The following code, for example, uses this technique to keep the bitshift operator << from being regrouped with neighboring operators of higher precedence when the macro is used in some larger expression.

#define FLAG1 (1<<0)
#define FLAG2 (1<<1)
#define FLAG3 (1<<2)
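
To see what the parentheses protect against, consider this sketch, which compares a flag defined without parentheses to one defined with them. BAD_FLAG3, GOOD_FLAG3, and PrecedenceDemo are hypothetical names used only here. It relies on + binding more tightly than <<.

```dm
// Hypothetical macros, for illustration only.
#define BAD_FLAG3 1<<2
#define GOOD_FLAG3 (1<<2)

proc/PrecedenceDemo()
   var/a = BAD_FLAG3 + 1     // expands to 1<<2 + 1, which groups as 1<<(2 + 1), giving 8
   var/b = GOOD_FLAG3 + 1    // expands to (1<<2) + 1, giving 5
   world.log << "a = [a], b = [b]"
```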

2.2 Special Macros

There are a few macros with special meanings. These are described in the following sections.

2.2.1 FILE_DIR

The FILE_DIR macro defines the search path for resource files (i.e. files in single quotes). Unlike most macros, it may be defined multiple times in a cumulative fashion. Subsequent definitions simply add to the list of paths to search.

#define FILE_DIR Path
Path is the location of resource files.

By using this macro, you can avoid entering the full path to resource files but can instead just enter the name of the file and let the compiler find it for you. Of course this would lead to confusion if the files in all the specified locations do not have unique names. If that happens, the first one found will be used.

The following example is a typical case. It simply defines two directories--one for icons and one for sounds.

#define FILE_DIR icons
#define FILE_DIR sounds
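
With those two paths in place, resource files might then be referenced by name alone, roughly as in the sketch below. It assumes that warrior.dmi exists in the icons directory and that a hypothetical welcome.wav exists in the sounds directory.

```dm
#define FILE_DIR icons
#define FILE_DIR sounds

mob/icon = 'warrior.dmi'          // located as icons/warrior.dmi

mob/Login()
   ..()
   src << sound('welcome.wav')    // located as sounds/welcome.wav (hypothetical file)
```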

2.2.2 DEBUG

The DEBUG macro enables the inclusion of extra debugging information in the dmb file. This makes the file bigger and will result in very slightly slower execution of procedures. However, the advantage is that when a proc crashes, it will tell you the source file and line number where the problem occurred. Without the extra debugging information, only the name of the procedure is reported.

#define DEBUG

It doesn't matter if you give DEBUG a value or not. Just defining it turns on debugging mode.

2.2.3 __FILE__

The __FILE__ macro is replaced by a text string containing the name of the current source file. This may be useful when generating debugging error messages.

2.2.4 __LINE__

The __LINE__ macro is replaced by the number of the current source line being read. This too may be useful when generating debugging error messages. The following example demonstrates this.

proc/MyProc()
   //blah blah

   world.log << "[__FILE__]:[__LINE__]: We got this far!"

   //blah blah

2.2.5 DM_VERSION

The DM_VERSION macro is the version number of the compiler (217 at the time I am writing). This could be used by library writers when the code requires new language features that were not available before a certain version or if the syntax changed in some way. By using conditional compilation or the #error command, one could make the library code adapt to earlier versions of the compiler just in case someone tries to use one.
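
A library might guard against an old compiler along the following lines. The threshold of 217 is simply the version mentioned above, used here as a placeholder. A real library would substitute the version that introduced the feature it depends on.

```dm
// Placeholder threshold: substitute the version your library really needs.
#if DM_VERSION < 217
#error This library needs DM version 217 or later.
#endif
```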

2.3 #undef

The #undef command removes a macro. In the code that follows, the macro will no longer be substituted. This could be used at the end of library files to prevent any macros that are used internally from taking effect in the code that includes them.
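
For instance, a library file might define a macro for its own use and remove it again at the end, as in this sketch. LIB_MAX_ITEMS and LibCanCarry are hypothetical names.

```dm
// Hypothetical library file, for illustration only.
#define LIB_MAX_ITEMS 8

proc/LibCanCarry(count)
   return count <= LIB_MAX_ITEMS    // the macro is substituted here

#undef LIB_MAX_ITEMS
// From this point on, LIB_MAX_ITEMS is an ordinary unused name again,
// so code that includes this file may use it for something else.
```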

2.4 Conditional Compilation

The preprocessor can be used to skip sections of code conditionally. The condition usually depends on the existence or value of other macros. In this way you can turn on or off features in the code by configuring a few macro definitions at the top of the project.

The commands for conditionally compiling code are described in the following sections.

2.4.1 #ifdef

The #ifdef command compiles the code which follows only if the specified macro has been defined. The section is terminated by the #endif command.

#ifdef Macro
//Conditional code.
#endif

There is also a #ifndef command which has the opposite effect. The code that follows is only compiled if the macro is not defined.
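
One plausible use of #ifndef is to supply a default value that the project may override by defining the macro earlier in the code. MAX_PLAYERS is a hypothetical name in this sketch.

```dm
// Use the project's setting if there is one; otherwise fall back to a default.
#ifndef MAX_PLAYERS
#define MAX_PLAYERS 20
#endif
```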

The DEBUG macro is sometimes used to turn on certain debugging features in the code. The following example demonstrates this technique.

#ifdef DEBUG
mob/verb/GotoMob(mob/M in world)
   set category = "Debugging"
   usr.loc = M.loc
#endif

2.4.2 #if

The #if command is a more general version of the #ifdef command because it can take any expression involving other macros and constants. If the expression is true, the code which follows is compiled. Otherwise it is skipped. Alternate conditions can be supplied with the #elif command and a final section to be compiled if all else fails may follow the #else command.

#if Condition
//Conditional code.
#elif Condition
//Conditional code.
#else
//Conditional code.
#endif

The condition may involve any of the basic operators but usually only uses the boolean operators. One addition is the defined instruction which tests if the specified macro has been defined.

defined (Macro)
Macro is the name of a macro.
Returns 1 if macro has been defined and 0 if not.
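
As a sketch of how these pieces might fit together, the following code picks a value according to a configuration macro and compiles in some logging only when one of two macros is defined. DIFFICULTY, MONSTER_HEALTH, VERBOSE_LOG, and the mob/monster type are hypothetical names.

```dm
// Hypothetical configuration macro: 1 = easy, 2 = normal, anything else = hard.
#define DIFFICULTY 2

#if DIFFICULTY == 1
#define MONSTER_HEALTH 50
#elif DIFFICULTY == 2
#define MONSTER_HEALTH 100
#else
#define MONSTER_HEALTH 200
#endif

mob/monster
   var/health = MONSTER_HEALTH

   New()
      ..()
#if defined(DEBUG) || defined(VERBOSE_LOG)
      world.log << "[src] created with [health] health"
#endif
```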

One common use of the #if command is to block out a section of code. This is sometimes done in the course of debugging or possibly to turn off a feature without throwing away the code. The following example demonstrates this technique.

#if 0
   //Disabled code.
#endif

Since DM allows comments to be nested (one inside another), it is also possible to accomplish the same thing by putting /* */ around the disabled code. It is a C programmer's habit to use the #if command because C does not allow comments to be nested.

2.5 #error

The #error command stops compilation and displays the specified error message. Library writers can use this to tell the user of the library if something is wrong.

#error Message

The following example will refuse to compile if the DM macro is not defined.

#ifndef DM
#error You need to define DM as the name of your key!
#endif

3. Some Code Management Issues

There are a few things to keep in mind when working with large DM projects. First and foremost one must strive for simplicity. The art of programming is mostly a matter of realizing your own limitations and compensating for them.

If, as the project grows, each new piece of code depends upon the details of every previous piece of code, the complexity of the project is growing exponentially. Before you know it, the code will rise up in revolt and stick you in a dark smelly dungeon. End of story.

Fortunately, most programming tasks do not require exponential complexity. With a good design, you can split the project into pieces which interact with each other in a fairly simple way. These pieces are often called modules which is why this practice is termed modular programming. (It is interesting to note, however, that all such schemes to avoid exponentially complex code ultimately fail. They only move the exponential growth to a higher level--from individual statements to procedures to objects and on and on. It may be true that complexity will always win out in the end and that every project undergoing perpetual growth must periodically be redesigned from scratch in order to remain comprehensible. Or perhaps this tendency is merely the result of a periodic increase in wisdom to offset the inevitable decline in intelligence. In my own case, I know this to be a prominent factor.)

Although the term module can refer to any unit of code, it most often is embodied by a file or group of files. The public parts of the module are those procedures, variables, and object types which are advertised for use by code outside the module. This is called the module interface and defines the syntax for putting information in and getting results out of the module. All other private material is considered internal to the module and is not for use by outside code.

When devising a project, one should foresee the function of the different component modules and have a rough idea of the interface to each one. When work commences on a module, it is worth putting a description of the public interface in a comment at the top of the file. This helps focus development along lines consistent with a good clean interface. You will also find it a useful reference in the future when you or someone else needs to use the module. You won't need to page through expanses of code to figure out how to operate your wonderful gadget.
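
Such a header comment might look something like the following sketch. The file name combat.dm and every procedure and variable in it are hypothetical. The point is only the shape of the comment and the split between public and internal material.

```dm
/*
   combat.dm (a hypothetical module)

   Public interface:
      mob/var/health            current health of a mob
      mob/proc/Attack(mob/M)    strike M and report the damage

   Everything else in this file is internal and may change without notice.
*/

mob/var/health = 100

mob/proc/Attack(mob/M)
   var/damage = combat_roll()
   M.health -= damage
   view(src) << "[src] hits [M] for [damage] damage."

// Internal helper: not part of the public interface.
proc/combat_roll()
   return rand(1,6)
```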

3.1 Ordering Code

In many cases, the sequential order of DM code makes no difference. For example, a procedure, variable, or object type may be defined before or after being used in the code. This is different from some languages which require every symbol to be defined prior to being used.

There are a few cases, however, when the order of code does matter. The preprocessor, for example, operates strictly sequentially from top to bottom of the code. The principal consequence of this is that macro definitions must precede their use. This is one good reason to use constant variables instead whenever possible.
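
The difference can be sketched as follows, using the hypothetical names BASE_DELAY and StepDelay. The constant variable may be defined after the procedure that uses it, whereas a macro in the same position would not yet exist when the procedure is read.

```dm
mob/proc/StepDelay()
   return BASE_DELAY * 2    // fine: the constant is defined further down

var/const/BASE_DELAY = 5

// By contrast, a definition such as
//    #define BASE_DELAY 5
// placed at this point would come too late for StepDelay(), because the
// preprocessor would already have passed over the procedure.
```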

Another time when code sequence matters is when overriding object procedures or variable initializations. If the same procedure is overridden several times in the same object type, subsequent versions take precedence and will treat previous ones as their parent procedure.

One might, for example, add functionality to the client.Topic() procedure in several different locations in the code. As long as you remember to execute the parent procedure each time, the additions are cumulative.

client/Topic(T)
   if(T == "introduction")
      usr << "Once upon a time..."
   else ..()

client/Topic(T)
   if(T == "help")
      usr << "The situation is helpless."
   else ..()

As written, these two definitions of the Topic procedure can fall in any order with any amount of intervening code. If one of them neglected to call ..(), however, it would disable any previous versions of the procedure. It is therefore good practice to always call the parent procedure unless you specifically wish to disable it. Then you don't have to worry about maintaining any special order of the procedures.
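
To illustrate the failure case, suppose a further definition like the one in this sketch were added. The "quit" topic is hypothetical. Because it never calls ..(), any definitions of client.Topic() that come before it would no longer be reached.

```dm
client/Topic(T)
   if(T == "quit")
      usr << "Goodbye."
   // No call to ..() here, so earlier versions of Topic() never run.
   // Adding "else ..()" would make this definition safe to place anywhere.
```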

3.2 Debugging Code

Bugs happen. Actually that is an understatement in large projects. Bugs happen frequently. This is fortunate, because there is nothing more satisfying than exterminating a bug.

3.2.1 Good Coding Habits

The novice programmer has far too much faith in the compiler. The veteran bug hunter, however, knows that just because the code compiles doesn't mean it works. It could still be infested with potential problems.

The first rule for successful debugging is to compile the code yourself. Of course you do not need to generate the byte code by hand; that's what the compiler is for. Compiling the code yourself means reading through the code you have written as though you were the compiler and making sure what the compiler sees matches what you intended.

The second good debugging habit is to run the code yourself. Initialize the necessary variables to some typical values and step through the procedure in your mind. The server can catch simple errors, but only you know what the code is supposed to do, so only you can tell the difference between code which runs and code which actually works. After doing a typical case, also be sure to think through any exceptional cases which may occur. For example, with a loop, you should verify that the first and last iteration will operate as expected.
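
As a small exercise in this habit, consider the hypothetical procedure below, which adds up the numbers in a list. The comments record the checks one might make while stepping through it mentally.

```dm
// Hypothetical example: sum the numbers in a list.
proc/SumList(list/L)
   var/total = 0
   // First iteration: i is 1, the first valid index, since DM lists start at 1.
   // Last iteration: i equals L.len, the final valid index.
   // Empty list: L.len is 0, so the loop body never runs and 0 is returned.
   for(var/i = 1, i <= L.len, i++)
      total += L[i]
   return total
   // Exceptional case to think through: if L is null, reading L.len fails.
```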

After doing these pre-checks, it is, of course, vital to test the code for real. This is known as beating on the code. Don't be gentle. Treat it roughly to expose any unforeseen weaknesses. If it is code which responds to user input, try doing the usual things and then try things you wouldn't normally expect.

Code which has passed these three tests will be reasonably sound. By catching bugs early, you save yourself a lot of trouble, because the code is fresh in your mind and therefore easier to decipher. Besides, you will find that deciphering bug reports from other users can be even harder!

3.2.2 Elusive Bugs

Even when you have been careful, some subtle problems may still occasionally slip through. Hunting them down can be a frustrating experience, so it is good to have a few tricks up your sleeve.

There are two types of bugs: proc crashers and silent errors. Those that crash procs are the result of some exceptional case occurring. For example, the code might be trying to access an object's variable but the object reference is null. Allowing this sort of case to silently slide by (by pretending the variable of the non-existent object is null, for example) might be a convenient thing to do in some cases, but in others it might cover up a genuine error that needs to be corrected by the programmer. Crashing the procedure and reporting the problem therefore makes it much easier for you to discover the problem and find its source.
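
Where a null reference is a legitimate possibility rather than a mistake, the code can test for it explicitly, as in this sketch. The target variable and ShowTargetName procedure are hypothetical.

```dm
mob/var/mob/target    // hypothetical variable; null until a target is chosen

mob/proc/ShowTargetName()
   if(!target)
      // Handle the expected "no target" case here instead of crashing
      // on target.name below.
      src << "You have no target."
      return
   src << "Target: [target.name]"
```

Without the check, reading target.name while target is null would crash the procedure, which is the right outcome only when a missing target indicates a genuine bug.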

When the procedure crashes, some diagnostic information is output to world.log. When running the world directly in the client, this information is displayed directly in the terminal window. With a stand-alone server, it is normally in the server's output but may be redirected to a file.

The most important part of the diagnostic information is the name of the procedure that crashed. The values of the src and usr variables are also included. If there are any procedures in the call stack (that is, the procedure which called this one, and the procedure which in turn called it, and so on) these are displayed as well.

If this is not enough information for you to locate the source of the problem, try compiling the world with the DEBUG macro defined. This will add source file and line number information to the diagnostic output.

One may also need to probe around in the code to see what is going on. This can be accomplished by sending your own diagnostic information to world.log. For example, you might want to know the value of a variable at a particular point in the code. This could be done with a line like the following:

world.log << "[__LINE__]: myvar = [myvar]"

Sometimes debugging output such as this is simply removed after fixing the problem, but sometimes you may want diagnostic information to appear whenever you are testing the code. In this case, a macro such as the following may be useful.

#ifdef DEBUG
#define debug world.log
#else
#define debug null
#endif

You can then send output to debug and it will be ignored when not in DEBUG mode.
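
A usage sketch follows, with the macro definition repeated so that the fragment stands on its own. The Login() override is just a hypothetical place to put a diagnostic message.

```dm
#ifdef DEBUG
#define debug world.log
#else
#define debug null
#endif

mob/Login()
   ..()
   // Written to world.log in DEBUG mode; sent to null, and so ignored, otherwise.
   debug << "[__FILE__]:[__LINE__]: [src] logged in"
```

Note that a lowercase macro name such as debug will be substituted wherever that word appears on its own in the code, so it should not also be used as a variable or procedure name.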

Another tool for hunting bugs is to comment out code. This may be helpful when determining whether a certain piece of code is responsible for an observed problem. By simplifying the procedure and gradually disabling all but the code which causes the glitch, you can save yourself from scrutinizing a lot of irrelevant material.

This is also essential when asking others for help. Nobody wants to read through pages and pages of somebody else's code. If you can't see the problem yourself but can narrow it down to a small piece of code, you will find getting help from other programmers much easier (and more fruitful). Sometimes just trying to clearly define the problem enables you to suddenly see the solution yourself--avoiding the embarrassment altogether.