This document describes some of the common errors that occur with the use of ILU, and some techniques for dealing with them.
Our support for C++ currently depends on having the constructors for all static instances run before main() is called. If your compiler or interpreter doesn't support that, you will experience odd behavior. The C++ language does not strictly mandate that this initialization will be performed, but most compilers seem to arrange things that way. We'd like to see how many compilers do not; if yours doesn't, please send a note to [email protected] telling us what the compiler is.
ILU uses the static-object-with-constructor trick to effect per-compilation-unit startup code. In certain cases you'll want to ensure that a certain compilation unit's initialization is run before another's. While C++ defines no standard way to do this, most compilers work like this: compilation units are initialized (static object constructors run) in the order in which they are given to the link-editor. We ([email protected]) want to hear about any exceptions to this rule.
Note for Windows users: Please refer to the chapter "Using ILU with Microsoft Windows" to see how ILU trace debugging is handled for Windows applications.
ILU contains a number of trace statements that allow you to observe the progress of certain operations within the ILU kernel. To enable these, you can set the environment variable ILU_DEBUG with the command
setenv ILU_DEBUG "xxx:yyy:zzz:..."
where xxx, yyy, and zzz are the names of various trace classes.
The classes are (as of December 1997) packet, connection, incoming, export, authentication, object, sunrpc, courier, dcerpc, call, tcp, udp, xnsspp, gc, lock, server, malloc, mainloop, iiop, http, error, sunrpcrm, inmem, security, thread, lsr, type and binding.
The special class ALL will enable all trace statements: setenv ILU_DEBUG ALL.
The special class MOST will enable all trace statements except lock and malloc: setenv ILU_DEBUG MOST.
The environment variable ILU_DEBUG_FILE may be used to direct debugging output to a file.
The function ilu_SetDebugLevelViaString(char *trace_classes) may also be called from an application program or debugger, to enable tracing. The argument trace_classes should be formatted as described above.
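The colon-separated form described above can be assembled programmatically. The following is a minimal sketch, not part of ILU: the helper name build_trace_spec is hypothetical, and it only joins class names with ':' into a caller-supplied buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an ILU function): joins n trace class names
   with ':' into buf, producing the "xxx:yyy:zzz" form.
   Returns 0 on success, -1 if buf is too small to hold the result. */
int build_trace_spec(const char *const names[], size_t n, char *buf, size_t size)
{
    size_t used = 0;

    if (size == 0)
        return -1;
    buf[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(names[i]);
        size_t sep = (i > 0) ? 1 : 0;   /* one ':' before every name but the first */

        if (used + sep + len + 1 > size) /* +1 for the terminating NUL */
            return -1;
        if (sep)
            buf[used++] = ':';
        memcpy(buf + used, names[i], len);
        used += len;
        buf[used] = '\0';
    }
    return 0;
}
```

The resulting string could then be passed to ilu_SetDebugLevelViaString, or exported as the value of ILU_DEBUG before the program starts.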
ILU_DEBUG
may also be set to an unsigned integer value, where each bit set in the binary
version of the number corresponds to one of the above trace classes. For a list of the
various bit values, see the file `ILUHOME/include/iludebug.h'. Again, you can
also enable the tracing from a program or from a debugger, by calling the
routine ilu_SetDebugLevel(unsigned long trace_bits)
with an unsigned integer argument.
The routine ilu_SetDebugMessageHandler
allows
an application to specify an alternate routine to be called when an error
or debugging message is to be printed.
[ILU kernel]: void ilu_SetDebugMessageHandler (void (*handler) (char *formatSpec, va_list args))
Locking: unconstrained
Registers handler with the ILU kernel to be called whenever a debugging or error message is output via ilu_DebugPrintf, instead of the default handler, which simply prints the message to stderr, using vfprintf.
Two special constant values for handler are defined: ILU_DEFAULT_DEBUG_MESSAGE_HANDLER, which will cause the default behavior to be resumed, and ILU_NIL_DEBUG_MESSAGE_HANDLER, which will cause debugging and error messages to be simply, silently, discarded.
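As an illustration of the handler signature given above, here is a minimal sketch of a handler that captures the most recent message in a buffer instead of printing it, which might suit an application with no usable stderr. The names capture_handler, last_message and emit_for_demo are hypothetical; emit_for_demo is only a stand-in for the kernel's call path so the sketch can be exercised without linking against ILU.

```c
#include <stdarg.h>
#include <stdio.h>

/* Most recent debugging or error message, already formatted. */
char last_message[512];

/* Handler with the signature documented for ilu_SetDebugMessageHandler.
   It formats the message into last_message, truncating if necessary. */
void capture_handler(char *formatSpec, va_list args)
{
    vsnprintf(last_message, sizeof last_message, formatSpec, args);
}

/* Hypothetical stand-in for the kernel: packages its variable arguments
   into a va_list and passes them to the given handler. */
void emit_for_demo(void (*handler)(char *, va_list), char *formatSpec, ...)
{
    va_list args;

    va_start(args, formatSpec);
    handler(formatSpec, args);
    va_end(args);
}
```

In a program linked against ILU, the handler would be installed with ilu_SetDebugMessageHandler(capture_handler) rather than called through the stand-in.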
islscan
The islscan program is supplied as part of the ILU release. It runs the ISL parser against a file containing an interface, and prints a "report" on the interface to standard output. It can therefore be used to check the syntax of an interface before running any language stubbers.
Setting the environment variable ISLDEBUG to any value (say, "t"), before running any ILU stubber or the program islscan, will cause ILU's parser to print out its state transitions as it parses the ISL file. If you're having a serious problem finding a bug in your ISL file, this might help.
Report bugs (nah! -- couldn't be!) to the Internet address [email protected], or to the XNS address ILU-bugs:PARC:Xerox. Bug reports are more helpful with some information about the activity. General comments and suggestions can be sent to either [email protected] or ILU-bugs.
Often our first reply to a bug report is a request for a typescript that shows the bug occurring, with all trace debugging turned on. If that doesn't make it clear to us, our second reply may be a request for a stack trace, with printouts of relevant variables and data structures. Including these things in your bug report may speed the cycle of interactions.
gdb
When using ILU with C++ or C or even Common Lisp,
running under the GNU debugger gdb
can be helpful for
finding segmentation violations and other system errors.
ILU provides a debugging trace
feature which can be set from gdb
with the following command:
(gdb) p ilu_SetDebugLevel(0xXXX)
ilu_SetDebugLevel: setting debug mask from 0x0 to 0xXXX
$1 = void
(gdb)
The value XXX is an unsigned integer as discussed in section 3.
The debugger dbx should also work.
We are in the midst of installing a consistent new way of handling runtime failures into the ILU runtime kernel. This new way involves the kernel reporting the failure to its caller; the old way involves combinations of panicking, reporting to the user (not the caller) via a printed message, and fragmentary reporting to the caller. Every time a runtime failure is noted the new way, the procedure _ilu_NoteRaise in `ILUSRC/runtime/kernel/error.c' is called; this procedure thus makes a good place to set a breakpoint when debugging. Most runtime failures occur due to genuine problems; some occur during normal processing (e.g., end-of-file detection).
Ideally, the ILU runtime would report all failures to the application, in the way most appropriate for the application's programming language. Sadly, this is not yet the case.
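The ILU runtime kernel has three kinds of runtime failures: (1) memory allocation failures, (2) unrecoverable internal consistency check failures, and (3) recoverable internal consistency check failures.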
The second kind is being eliminated. The first kind is being reduced, and might also be eliminated.
The application can specify how each of these three kinds of runtime failures is to be handled. The choices are:
Generate a SEGV signal by attempting to write to protected memory. This is useful for generating core dumps for later study of the error.
Loop forever, calling sleep(3) repeatedly. This option is useful for keeping the process alive but dormant, so that a debugger can attach to it and examine its "live" state. This is the default action for all three kinds of failures.
An application can change the action taken on memory failures by calling ilu_SetMemFailureAction or ilu_SetMemFailureConsumer.
[ILU kernel]: void ilu_SetMemFailureAction ( int mfa )
Locking: unconstrained
Calling this tells the ILU kernel which drastic action is to be performed when ilu_must_malloc fails. -2 means to print an explanatory message on stderr and then coredump; -1 means to print an explanatory message on stderr and then loop forever in repeated calls to sleep(3); positive numbers mean to print an explanatory message on stderr and then exit(mfa). The default is -1.
[ILU kernel]: typedef void (*) (const char *file, int line) ilu_FailureConsumer
A procedure that is called when the ILU kernel can't proceed. This procedure must not return.
[ILU kernel]: void ilu_SetMemFailureConsumer ( ilu_FailureConsumer mfc )
Locking: unconstrained
An alternative to ilu_SetMemFailureAction: this causes mfc to be called when ilu_must_malloc fails.
Similarly, an application specifies how unrecoverable runtime consistency check failures are to be handled by calling ilu_SetAssertionFailureAction or ilu_SetAssertionFailConsumer, which are exactly analogous to the procedures for memory failure handling. For recoverable consistency check failures, an application can call ilu_SetCheckFailureAction or ilu_SetCheckFailureConsumer.
[ILU kernel]: void ilu_SetCheckFailureAction ( int cfa )
Locking: unconstrained
Calling this tells the runtime which action is to be performed when an internal consistency check fails. -3 means to raise an error from the kernel (without necessarily printing anything); -2 means to print an explanatory message to stderr and then coredump; -1 means to print and then loop forever; non-negative numbers mean to print and then exit(cfa); other numbers are reserved. The default is -1.
[ILU kernel]: typedef void (*) (const char *file, int line) ilu_CheckFailureConsumer
A procedure for handling an internal consistency check failure. If this procedure returns, the consistency check failure will be raised as an error from the kernel.
[ILU kernel]: void ilu_SetCheckFailureConsumer ( ilu_CheckFailureConsumer cfc )
Locking: unconstrained
An alternative to ilu_SetCheckFailureAction: this causes cfc to be called, with no message printed; if cfc returns, an error will be raised from the kernel.
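A consumer of the kind described above might simply record where the check failed and then return, leaving the kernel to raise the error. The following is a minimal sketch under that assumption. The typedef is a local copy of the documented signature so that the sketch compiles without ILU headers, and the names check_failure_consumer, record_check_failure and the three variables are hypothetical.

```c
#include <stddef.h>

/* Local copy of the documented ilu_CheckFailureConsumer signature; a
   program linked against ILU would use the kernel's own declaration. */
typedef void (*check_failure_consumer)(const char *file, int line);

/* Site of the most recent consistency check failure, kept for later
   inspection or reporting by the application. */
const char *last_check_file = NULL;
int last_check_line = 0;
unsigned check_failure_count = 0;

/* Records the failure site and returns. Per the description above,
   returning means the failure is then raised as an error from the kernel. */
void record_check_failure(const char *file, int line)
{
    last_check_file = file;
    last_check_line = line;
    check_failure_count++;
}
```

With ILU linked in, the consumer would be registered by calling ilu_SetCheckFailureConsumer(record_check_failure). A consumer for memory or assertion failures is different: it must not return.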
For language mappings consistent with CORBA, the third kind of failure is reported as an occurrence of the CORBA system exception internal, with a minor code that encodes the filename and line number where the consistency check occurs.
The coding is this: 10,000*hash(filename, 32771) + linenum + 1,000.
The directory part, if any, is stripped from the filename before hashing.
To aid in decoding these minor codes, ILU includes the program decoderr, which is used like this:
% decoderr 269211234
269211234 = line 234, file $ILUSRC/runtime/kernel/call.c
If a reportable consistency check failure occurs in a file not anticipated in the construction of decoderr, you'll see something like this:
% decoderr 60612345
60612345 = line 1345 in unknown file (that hashes to 6061)
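The arithmetic behind these minor codes can be sketched as follows. This is an illustration of the stated coding only, not the source of decoderr; the names check_site, encode_minor_code and decode_minor_code are hypothetical, and the decoding assumes the line number is below 9,000 so that linenum + 1,000 does not carry into the hash digits.

```c
/* Decoded location of a consistency check: the filename hash and line. */
struct check_site {
    unsigned long file_hash;
    unsigned long line;
};

/* Applies the stated coding: 10,000*hash + linenum + 1,000. */
unsigned long encode_minor_code(unsigned long file_hash, unsigned long line)
{
    return 10000UL * file_hash + line + 1000UL;
}

/* Inverts the coding, assuming line < 9000. Returns 0 on success, or -1
   if the low four digits are below 1000 and so cannot be linenum + 1000. */
int decode_minor_code(unsigned long minor, struct check_site *site)
{
    unsigned long low = minor % 10000UL;   /* linenum + 1000 */

    if (low < 1000UL)
        return -1;
    site->file_hash = minor / 10000UL;
    site->line = low - 1000UL;
    return 0;
}
```

Applied to the two codes shown above, this yields hash 26921 with line 234, and hash 6061 with line 1345, matching the decoderr output.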
The program iluhashm can be used to hash given filenames, so you can search a set of candidates for the mysterious hash code:
% iluhashm 32771 ../cpp/foobar.cpp ../cpp/barfoo.cpp
/* Generated at Mon Dec 11 22:44:47 1995 with modulus 32771 */
{ 6061, "../cpp/foobar.cpp"},
{ 13273, "../cpp/barfoo.cpp"},
Users often run into the same difficulties other users have had. This section lists some of these common problems, and describes the possible cures.
Problem: A server cannot publish an object or a client cannot lookup an object.
Discussion: When using the shared file approach for simple binding, the machines on which the client and server programs run must have some shared filesystem. Each program must also have the environment variable ILU_BINDING_DIRECTORY set to a directory within that file system where the publications will be written and read.
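A program can check its own side of this requirement at startup. The sketch below is an assumption-laden illustration, not ILU code: the function name check_binding_directory is hypothetical, it uses the POSIX stat call, and it only verifies that the variable is set and names an existing directory. It cannot tell whether that directory is actually shared with the other machine.

```c
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical startup check (not an ILU function).
   Returns  0 if ILU_BINDING_DIRECTORY names an existing directory,
           -1 if the variable is unset or empty,
           -2 if its value is not an existing directory. */
int check_binding_directory(void)
{
    const char *dir = getenv("ILU_BINDING_DIRECTORY");
    struct stat st;

    if (dir == NULL || dir[0] == '\0')
        return -1;
    if (stat(dir, &st) != 0 || !S_ISDIR(st.st_mode))
        return -2;
    return 0;
}
```

Running the same check in both the client and the server, and comparing the directory each one reports, is a quick way to rule out a mismatch.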
Problem: It seems that ILU is contacting the wrong server, or if I look at the SBH's for objects that I know are coming from one source, ILU thinks they're from someplace else.
Discussion: This is usually caused by creating multiple ilu_Server's (e.g. in C, the thing you get back from ILU_C_InitializeServer (...)) that have the same server ID. The server ID should be unique. To understand why, consider what ILU does when a non-local object reference (an SBH) comes in off the wire. ILU looks at the reference and checks to see if it has a surrogate ilu_Server with that name. If not, ILU creates one, and (important point) stores away the contact information for that ilu_Server. If it already has one with that name, ILU assumes that that is the ilu_Server for the object; it doesn't check to see if the contact info is different. Thus, operations directed at objects that are served by that particular ilu_Server will always be directed at the ilu_Server that ILU saw first. [ILU could potentially keep track of multiple contact infos, but that still wouldn't help to disambiguate where operations should be directed.] This is why in many of the example programs, you see server ID's being created using some combination of a fixed string and the name of the host the process is running on. Of course, if you run multiple instances of the example on the same machine, you would want to also incorporate some process or thread information. You can also simply let ILU generate a server ID for you, one that is unique with a high degree of probability.
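The combination of a fixed string, the host name and process information might be assembled as in the following sketch. The function name format_server_id and the "prefix.host.pid" layout are hypothetical choices for illustration; ILU does not prescribe this format. On a POSIX system the caller could obtain the host name from gethostname and the process number from getpid.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not an ILU function): formats a server ID as
   "<prefix>.<host>.<pid>" into buf.
   Returns 0 on success, -1 if the result does not fit in buf. */
int format_server_id(const char *prefix, const char *host, long pid,
                     char *buf, size_t size)
{
    int n = snprintf(buf, size, "%s.%s.%ld", prefix, host, pid);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The resulting string would then be supplied as the server ID when the ilu_Server is created, so that two processes on the same host never share an ID.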
Problem: My process 'A' has an object reference to an object 'O' in process 'B'. Process 'B' exits, and then restarts. Even though the server name and object identifier for 'O' are the same as the first time around, process A is unable to perform operations on 'O'.
Discussion: The answer here is similar to the answer to the "It seems that ILU is contacting the wrong server... " problem. You're probably letting ILU choose the server's port number, either by letting ILU use its defaults or by specifying a 0 in the port field in the transport specification when creating the ilu_Server. ILU in process 'A' caches the contact information from the first process 'B'. When process 'B' comes back up, the port number is different. You can specify what port should be used in the transport information to prevent this from happening. For example, to always come up at port 1234, use "tcp_0_1234".
Problem: How come a lot of things seem broken in the C++ support?
Discussion: Yes, we know! The C++ support is in the process of being completely redone for ILU release 2.0. As such, we allowed the current support to remain in disrepair, and don't bother to fix much of anything in it. Use the current C++ support at your own risk!
Problem: I'm having problems importing ILU into Python.
Discussion: (Where ILUHOME represents where you installed ILU) You need to have the ILUHOME/lib directory on your PYTHONPATH environment variable. Also, ensure that ILUHOME/bin is on your PATH environment variable.
Problem: I'm in Windows, and trying to build some of the examples and I get complaints that it doesn't know how to make some of the files.
Discussion: The Windows make files are not set up to run the language stubbers. You must run the stubbers manually before doing the make. For example: c-stubber Test1.isl
Problem: I'm on Unix (most probably Digital's), and my program sometimes exits unexpectedly.
Discussion: You may be running into a problem where a PIPE signal is generated and the established action is to exit the program. In the ILU source file runtime/kernel/bsdutils.c, the function _ilu_HandleSigPIPE tries to set up the process to ignore SIGPIPE. However, it only does this if the initial SIGPIPE handler is SIG_DFL. (You should see an error message if _ilu_HandleSigPIPE can't set up to ignore SIGPIPE.) On some systems it has been noticed that even though the application did not explicitly set up a SIGPIPE handler, the initial SIGPIPE handler is not SIG_DFL, and the handler that runs terminates the program. A workaround to this problem is either to set the SIGPIPE handler to SIG_DFL yourself before the _ilu_HandleSigPIPE function runs, or to set it to a handler that does nothing with the signal.
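Both forms of the workaround can be sketched with the standard signal function. This is a minimal illustration and the function names are hypothetical; the application would call one of them early in main, before any ILU initialization runs.

```c
#include <signal.h>

/* A handler that does nothing with the signal. */
void noop_sigpipe_handler(int sig)
{
    (void)sig;
}

/* First form: restore the default disposition, so that ILU's own startup
   code finds SIG_DFL in place. Returns 0 on success, -1 on failure. */
int reset_sigpipe_to_default(void)
{
    return (signal(SIGPIPE, SIG_DFL) == SIG_ERR) ? -1 : 0;
}

/* Second form: install the do-nothing handler, so that a SIGPIPE no
   longer terminates the process. Returns 0 on success, -1 on failure. */
int install_noop_sigpipe_handler(void)
{
    return (signal(SIGPIPE, noop_sigpipe_handler) == SIG_ERR) ? -1 : 0;
}
```

With either form in place, a write to a closed connection fails with an error return instead of ending the process.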