[Gambas-user] Debugging Gambas (again)

Cedron Dawg cedron at exede.net
Mon Apr 29 17:19:55 CEST 2019


First, I wish I could help the OP better.  I can understand his frustration.  If the problem is in Gambas, finding the minimal program that demonstrates it is definitely the best approach.  If that is a laborious task, I think he should ask if he can email the full code to either Tobi or Benoit to look at, rather than attach it to this thread.  He can also email it to me (cedron at exede dot net), and if I can reproduce it (3.12.2) I might be able to minimize it.

Let's return to our discussion and generalize it away from the OP's situation.  How can capacity issues lead to a seg fault?  Here are a few ways:

1) A malloc return value is not checked, and a failed allocation returns NULL.  The subsequent code tries to write through the NULL pointer and boom, seg fault.

2) The interpreter keeps an internal stack of some type.  The calling program never calls the "close" for all the "open"s, the stack overflows, and boom, seg fault.

3) A circular buffer captures sensor readings.  The process reading the buffer calculates the index into a lookup table by taking the difference of two sensor readings.  They can only change so much from reading to reading, so a range check isn't made.  The sensor goes nuts and starts spewing out readings, the buffer overflows, and the lookup index is made from two readings that are a buffer size apart.  The index goes out of range, and boom, seg fault.
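Case 3 can be sketched in a few lines of Python.  This is only an illustration under my own assumptions: RING_SIZE, MAX_STEP, LOOKUP, and the function names are all hypothetical, and Python raises an IndexError (or silently wraps a negative index) where unchecked C code would read outside the table and possibly seg fault.

```python
# Hypothetical sketch of case 3: a lookup index built from the difference
# of two sensor readings, with and without a range check.

RING_SIZE = 4   # assumed ring buffer capacity
MAX_STEP = 3    # assumed largest change between adjacent readings

# One table entry per legal difference, indexed by (delta + MAX_STEP).
LOOKUP = [f"step {d:+d}" for d in range(-MAX_STEP, MAX_STEP + 1)]


def lookup_unchecked(prev, curr):
    """Fragile: trusts that abs(curr - prev) <= MAX_STEP."""
    return LOOKUP[curr - prev + MAX_STEP]


def lookup_checked(prev, curr):
    """Robust: validates the difference before using it as an index."""
    delta = curr - prev
    if not -MAX_STEP <= delta <= MAX_STEP:
        raise ValueError(f"delta {delta} is outside +/-{MAX_STEP}")
    return LOOKUP[delta + MAX_STEP]


def read_pair(samples_written):
    """The reader has consumed sample 0 and now reads slot 1.

    The writer stores samples 1..samples_written and never checks
    whether it is overrunning the reader.
    """
    ring = [0] * RING_SIZE
    for sample in range(1, samples_written + 1):
        ring[sample % RING_SIZE] = sample
    return 0, ring[1]
```

With `read_pair(1)` the reader sees readings 0 and 1, and both lookups agree.  With `read_pair(5)` slot 1 has been overwritten by a reading a full buffer later, so the unchecked lookup indexes past the table while the checked one reports the bad difference.  Note that a large negative difference makes the unchecked Python version wrap around and quietly return the wrong entry, which is its own flavor of unpredictable result.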

How can a "bad case scenario" lead to a seg fault?  Here's an example:

Suppose the interpreter provides a "SendToRemotePeer" function that has an internal buffer for holding the message text.  The function call specifies where in the buffer to lay the text.  This code has to be fast, so range checks aren't made.  The calling program passes parameters that cause an overwrite past the end of the buffer by one character, and boom, seg fault, even though it had called the same function flawlessly thousands of times before.  The bug is in the calling program, not the interpreter.
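Here is a rough Python model of that scenario.  "SendToRemotePeer" is a made-up function to begin with, and everything in this sketch (the MessageBuffer class, BUF_SIZE, the method names, the sentinel byte) is hypothetical as well.  A single bytearray stands in for adjacent memory: the message buffer followed by one byte of unrelated state.

```python
# Hypothetical model of a fast, unchecked buffer write next to a
# validated one.  Not taken from any real interpreter.

BUF_SIZE = 8       # assumed size of the internal message buffer
SENTINEL = 0x7F    # stands in for whatever lives right after the buffer


class MessageBuffer:
    def __init__(self):
        # BUF_SIZE bytes of buffer plus one neighbouring byte.
        self.memory = bytearray(BUF_SIZE + 1)
        self.memory[BUF_SIZE] = SENTINEL

    def put_text_fast(self, offset, text):
        """Fragile: no range check, so a bad offset clobbers the neighbour."""
        for i, byte in enumerate(text.encode("ascii")):
            self.memory[offset + i] = byte

    def put_text_checked(self, offset, text):
        """Robust: rejects any write that would leave the buffer."""
        data = text.encode("ascii")
        if offset < 0 or offset + len(data) > BUF_SIZE:
            raise ValueError("text does not fit in the buffer at this offset")
        self.memory[offset:offset + len(data)] = data

    def neighbour_intact(self):
        return self.memory[BUF_SIZE] == SENTINEL
```

Calling `put_text_fast(0, "hi")` any number of times is harmless.  Calling `put_text_fast(4, "hello")` writes five bytes starting at offset 4, so the last one lands on the neighbouring byte.  In this model nothing crashes at that moment; the damage only shows up whenever the neighbour is next used, which is what makes this kind of bug hard to locate.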

Now, let's talk about fragility and robustness.  The difference only becomes important when the calling program sends invalid parameters.  Shouldn't all functions be coded to be robust?  No.  Robustness comes at a cost in code size, complexity, and execution time.  Whether it is worth it in a particular location is a judgement call.

What is robustness?  "Parameter validation."  How do errors get caught by the interpreter?  "Parameter validation."

What happens when you fail to do "parameter validation" and invalid data gets through?  Like you said, unpredictable results, and maybe, even yes, boom, seg fault.

All of the examples I cited above could have been caught with proper parameter validation.  Fragile code is not a bug or an error.  The bug lies in the code calling the fragile code with invalid values.

Should an interpreter be coded to be robust?  A general-purpose one, like Gambas, yep.  Should it be absolutely robust?  I'm not sure that is even theoretically possible.

From the user's point of view, does it make much difference whether Gambas catches a seg fault and displays a popup, or parameter validation finds the error cleanly and gives the user a popup?  The latter is likely to be a tad more informative about the nature of the error, but other than that there isn't really a qualitative difference.

In the OP's situation, what needs to be determined is:

1) Is it invalid data from the calling program and fragile Gambas code?

2) Is it valid data from the calling program and a bug in Gambas code?

Finally, about blowing up on the first call being a locus indicator.  If a function call blows up on the first call, you can easily determine whether the parameters are valid or not at the calling level.  So, with valid data, you know it is condition 2 above and a bug in the interpreter.  However, if the function call works a bunch of times, then fails, condition 1 becomes the more likely case.
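That triage can be written down as a small Python sketch.  It is purely illustrative: classify_failure and the toy functions below are hypothetical names, and a Python exception stands in for the crash, since a real seg fault cannot be caught this way.  The idea is simply to check the parameters at the calling level for each call and see which of conditions 1 and 2 above the first failure falls under.

```python
# Hypothetical triage helper: validate each call's parameters at the
# calling level, then see which condition the first failure matches.

def classify_failure(call_args, is_valid, fragile_call):
    """Return (call_number, condition) for the first failing call.

    condition 1: invalid data from the caller reached fragile code
    condition 2: valid data, so the bug is in the called code
    Returns None if every call succeeds.
    """
    for number, args in enumerate(call_args, start=1):
        valid = is_valid(*args)
        try:
            fragile_call(*args)
        except Exception:
            return number, (2 if valid else 1)
    return None


TABLE = ["a", "b", "c"]


def in_range(i):
    """Caller-side validity check for an index into TABLE."""
    return 0 <= i < len(TABLE)


def fragile_get(i):
    """Correct for valid input, but does no checking of its own."""
    return TABLE[i]


def buggy_get(i):
    """Off-by-one bug: fails on the last valid index."""
    return TABLE[i + 1]
```

Here `classify_failure([(0,), (1,), (7,)], in_range, fragile_get)` works twice and then fails on invalid data, which points at the caller, while `classify_failure([(2,)], in_range, buggy_get)` fails on the first call with valid data, which points at the called code.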

I still probably didn't explain things clearly enough, but that's what I was saying.


P.S.  "Real Mode" was the predecessor to "Protected Mode", meaning there was no processor-level out-of-range checking on memory references.  Those kinds of bugs are particularly difficult to reproduce and find.

----- Original Message -----
From: "Jussi Lahtinen" <jussi.lahtinen at gmail.com>
To: "user" <user at lists.gambas-basic.org>
Sent: Saturday, April 27, 2019 6:24:27 PM
Subject: Re: [Gambas-user] Debugging Gambas (again)

BTW, Jussi, I wrote a comprehensive byte-code interpreter in 8086 assembly in Real Mode as a DOS 3.1 TSR that was used in production for several years, circa 1988 

A segmentation fault means that the program is trying to access a memory address that it doesn't have the privileges to access. So, you can right away say it is definitely not a capacity issue or "bad case scenario" in the program logic, as you suggested. How could it be? Also, whether the program crashes right away or later says absolutely nothing about where the bug is. It just cannot, not even in theory.

Test this: 

Dim p As Pointer = 1 
Dim hStream As Stream 

hStream = Memory p For Write 
Print #hStream, "hello!" 
Close hStream 

There is no segmentation fault, but a "write error" from Gambas. Why? Because if errors are not caught by Gambas, then they can lead to undetermined behaviour later. Thus this "However, adding parameter validation testing to every function call is a questionable endeavor" is also a sign of not understanding at all what you are talking about.

Once more: 
Segmentation fault = operating system tells that Gambas tried to do something illegal. 
The "clean" errors = Gambas tells the programmer, that he tried to do something illegal. 

