[Gambas-user] Socket Limitations

Kadaitcha Man nospam.nospam.nospam at ...626...
Sun Jan 3 13:28:52 CET 2010


2010/1/3 Doriano Blengino <doriano.blengino at ...1909...>:
> Kadaitcha Man ha scritto:
>> 2010/1/3 Doriano Blengino <doriano.blengino at ...1909...>:
>>
>>> After a few minutes I suggested that a timeout could simplify things, I
>>> changed my mind. It would not be a totally bad idea but, as most other
>>> mechanisms, it has its problems. First, what is the right timeout?
>>>
>>
>> It is either 0 for no timeout or it is set by the application.
>>
> Uhm... I see the point.

:)

I figured you would. There was a lot more I could have said but chose not to ;->

> I intended that the timeout would be set by the
> application. Nevertheless, timeouts are often stupid,

No, not at all. I worked in real-world computing for 37 years before
retiring. If I have a 200GB/second fibre-optic connection to the net
but am trying to communicate with a 150 baud acoustic modem sitting in
the wilds of Timbuktu in Africa, then I should know that, on average,
it takes, say, fifteen minutes to transfer 0.5KB of data.
If the data has not been sent after, say, 20 minutes then I know I
ought to disconnect and try again later. Without a timeout I can't
make any such decision; the gambas socket is making decisions it
shouldn't be making.

Timeouts are not stupid, Doriano. It is reasonable for me to ask a socket
to tell me if the connection has been idle for such-and-such an amount of
time while it was trying to send or receive data.

> and should only be
> used to raise an error,

Yes.

> not as part of the normal logic of a program.

No. As soon as the error is raised, the normal logic of a program
takes over and decides what to do about the error.

> For example, in your application: what is the right timeout? May be a
> few seconds, but if there is something like 1 MiB to send to a very busy
> server, on a slow connection, some minutes would be required.

No. The timeout is not applied to the length of time it takes to
transmit any data; it is applied to the amount of time during which there
is no response between the two connected entities while trying to send or
receive data.

Nevertheless, to answer your question, my testing tells me that, for
all the servers I might connect to, if a transfer is silent for more
than 30 seconds then I have a problem to deal with. The protocol says
I have to disconnect and try again, and I know that is correct because
the protocol also says that the remote server will junk my
transmission unless it receives a certain signal from me that I have
finished sending data, and the protocol also tells me that the server
will acknowledge receipt.
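
To put that in concrete terms, here is a rough C sketch (not Gambas, and
the 30-second figure is just my own empirical value) using the SO_RCVTIMEO
option quoted from man socket further down. The timer bounds each
blocking recv() call, so a slow but active transfer never trips it; only
30 seconds of complete silence does:

    #include <errno.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <unistd.h>

    /* Read a reply from an already-connected socket, treating 30 idle
       seconds as a dead transfer. */
    static int read_reply(int fd)
    {
        struct timeval tv = { .tv_sec = 30, .tv_usec = 0 };
        char buf[4096];

        /* Ask the kernel to report an error after 30 silent seconds. */
        if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
            return -1;

        for (;;) {
            ssize_t n = recv(fd, buf, sizeof(buf), 0);
            if (n > 0)
                continue;            /* data arrived; a real program would use it */
            if (n == 0)
                return 0;            /* peer closed the connection cleanly */
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                /* The transfer went silent. The socket only reports it;
                   the program decides what to do (here: disconnect). */
                fprintf(stderr, "no data for 30 seconds, giving up\n");
                close(fd);
                return -1;
            }
            return -1;               /* some other socket error */
        }
    }

The socket does nothing clever there: it raises the condition, and the
caller applies the protocol's rule (disconnect and try again later).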

> Would you
> set the timeout to some minutes? Ok, let's do some minutes (we don't
> want the program to fail, if there is no real reason; a slow connection is
> not a good reason to fail, right?).

Yes it is a good reason. What if the customer you are writing the
program for makes it a requirement that if the data is not
successfully transferred and acknowledged as received within 30
seconds then the program must write a log record?

What if the customer uses that log record to determine possible faults
on their network, or to make decisions about upgrading to a faster
link? What if the customer believes there is a problem at 4PM every
day when everyone in a certain office does their final updates and
they all send their data to a central server at the same time? How is the
customer ever going to know for sure that 4PM really is a problem?

> Now improve your example
> application; instead of sending a single message to a single host, it is
> a proxy which accepts several incoming messages and deals them to
> several hosts. If at a certain point a remote host is very busy (or
> down), your proxy ceases to work because it is blocking, for several
> minutes. You don't want to do so, so you need non-blocking sockets. The
> timeout of the socket is still there, because sooner or later the socket
> will have to raise an error, but your application won't stop to work.
> This is how I mean timeout - only used for error recovery.

Yes, I agree. All you need do is raise a timeout error. The program
will decide what to do about it.

> But the
> first time I suggested a timeout, it was related to the program logic.

Oh, I see. No, all you need do is raise an error that can be trapped.
The program then decides what should be done.

> It can slightly simplify things, but at a cost - a possible incorrect
> behaviour. Using timeouts for communication can be a good idea only in
> precise situations, but gambas can not know what the situation is, it is
> a general purpose programming language.

No programming language can know what the situation is or how to deal
with it; only the program can know that, and that is why I was saying
the socket should not be making decisions it has no right to make.

> So, I think, gambas could also implement timeouts, but it would be
> responsibility of the user to use them in the correct way.

Yes. I 100% agree. However, such musings are not necessary because the
gb3 socket already supports blocking, and as I found out today, it is
my responsibility to implement it. Anyway, it is always, without fail,
the responsibility of the programmer to write the program. It is not
the responsibility of the programming language to make implementation
decisions for the programmer, which is exactly what gb3 is doing by
not supporting timeouts: it is making decisions it should not be making.

>> http://msdn.microsoft.com/en-us/library/system.net.sockets.socket.receivetimeout%28VS.80%29.aspx
>> http://msdn.microsoft.com/en-us/library/system.net.sockets.socket.sendtimeout%28VS.80%29.aspx
>>
>> $ man socket
>>
>>        SO_RCVTIMEO and SO_SNDTIMEO
>>               Specify  the  receiving  or  sending timeouts until reporting an
>>               error...
>>
>> perl:
>> timeout([VAL])
>> Set or get the timeout value associated with this socket. If called
>> without any arguments then the current setting is returned. If called
>> with an argument the current setting is changed and the previous value
>> returned.
>>
>> As you can see, the idea of a timeout is not a strange one to many
>> languages on Unix, Linux and Windows. In fact, I'd say it is an
>> absolute necessity. And if you are using the OS socket, which you seem
>> to be doing, then why should Gambas hide a property that is already
>> available to C/C++ and even script programmers?
>>
>>
>>> Second, if a timeout occurs, how much data has been sent?
>>>
>>
>> Again, that is not the business of the socket. The business of the
>> socket is to alert the program that a problem exists, nothing more.
>>
> Here I would say a concise *no*. From the ground up, things work like this:
> "send as many data as you can, and tell me how much you sent". In fact,
> the lowest level OS calls work like this: the return value is the number
> of bytes written or read (not only for sockets, even files, standard
> input and output, and so on). Of course you can use blocking mode, and
> rely on the fact that when the system call returns, either it failed or
> it wrote all the data. But the blocking mode is not always the best way
> to go.

Of course it isn't, and that's why I said the socket should not be
making the kinds of decisions it is.
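
For the record, a minimal C sketch of that "send as much as you can and
tell me how much you sent" contract (my own illustration, not anything
gambas does): send() reports the number of bytes it actually queued, and
the caller loops over the remainder:

    #include <errno.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /* Write an entire buffer to a socket, handling partial sends.
       send() may queue fewer bytes than asked for; its return value
       tells the caller exactly how far it got. */
    static ssize_t send_all(int fd, const char *buf, size_t len)
    {
        size_t sent = 0;

        while (sent < len) {
            ssize_t n = send(fd, buf + sent, len - sent, 0);
            if (n < 0) {
                if (errno == EINTR)
                    continue;        /* interrupted; just retry */
                return -1;           /* real error, or SO_SNDTIMEO expired */
            }
            sent += (size_t)n;       /* partial send: account for it, go on */
        }
        return (ssize_t)sent;
    }

Whether the caller blocks in that loop or feeds it from an event handler
is the program's choice, which is exactly the point.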

>>> anyway, this is a truely complicated matter.
>>>
>>
>> It is only complicated if you believe that the socket should poke its
>> nose into business it shouldn't :)
>>
>> If the connection goes belly up, the socket can, at best, know how
>> many bytes it sent into the ether, but it cannot ever know how many of
>> those bytes went into hyperspace never to be seen again. How can it?
>> It's not possible. That's why the client and server have to deal with
>> the problem between themselves.
>>
> False. TCP/IP is a very robust transport,

False. TCP/IP is a protocol suite, not just a transport: TCP is the
Transmission Control Protocol and IP is the Internet Protocol. The
transport layer is only one part of that suite.

Protocols define how two systems communicate with each other and deal
with success or failure. If you take another look at the code I
attached earlier, it is using a protocol, a defined RFC protocol (RFC
3977), and TCP/IP is also a defined RFC protocol (RFC 1122). Protocols
are the whole reason that the gambas socket should not make decisions
that the programmer should be making. Protocols define how
conversations take place between systems; protocols are the reason
that timeouts are necessary.

> I think we two are talking from two different points of view. I am
> talking about a general point of view, where a socket is used in many
> different situations, so one can not make assumptions about data size
> and timeouts.

One does not need to make assumptions. One tests and verifies, then
one sets appropriate timeouts based on empirical proof.

To be honest, and no insult intended, the only time I could ever
understand not having a timeout is if one is blindly sending and
receiving data with no protocols to define what is being sent or
received. Now that's mad: neither the client nor the server knows for
sure what the other one sent or received.

Client: [HEY, SERVER]
Server: [WHAT?]
Client: [I HAVE SOME DATA FOR YOU!]
Server: [OH HUM! OK, SEND IT, BUT TERMINATE IT WITH XYZ SO I KNOW I'VE GOT IT!]
Client: Sends data and [XYZ]
Server: [OH HUM! OK, I GOT IT!]
Client: [BYE]
Server: <hangs up the phone>

That is a protocol, as daft as it looks.
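
And even that daft protocol needs a timeout to be usable. A rough C
sketch of the client side (the strings are obviously made up and no real
server speaks them); every expected reply is read through a 30-second
idle timeout so a dead server cannot hang the client:

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <unistd.h>

    /* Read one reply line and check it starts with what we expect.
       NULL from fgets means EOF, an error, or the SO_RCVTIMEO below
       expired: the server went quiet, so the exchange has failed. */
    static int expect(FILE *in, const char *want)
    {
        char line[512];

        if (fgets(line, sizeof(line), in) == NULL) {
            fprintf(stderr, "no reply (timeout or disconnect)\n");
            return -1;
        }
        return strncmp(line, want, strlen(want)) == 0 ? 0 : -1;
    }

    static int talk(int fd)          /* fd: an already-connected socket */
    {
        struct timeval tv = { .tv_sec = 30, .tv_usec = 0 };
        FILE *in;

        setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
        /* dup() so that fclose() below does not close the caller's fd. */
        if ((in = fdopen(dup(fd), "r")) == NULL)
            return -1;

        dprintf(fd, "HEY, SERVER\r\n");
        if (expect(in, "WHAT?")) goto fail;
        dprintf(fd, "I HAVE SOME DATA FOR YOU!\r\n");
        if (expect(in, "OH HUM! OK, SEND IT")) goto fail;
        dprintf(fd, "the data\r\nXYZ\r\n");
        if (expect(in, "OH HUM! OK, I GOT IT!")) goto fail;
        dprintf(fd, "BYE\r\n");
        fclose(in);
        return 0;
    fail:
        fclose(in);
        return -1;
    }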

> This is the point of view of an operating system or a
> general purpose language.

The point of view of the general-purpose language is irrelevant,
because it should have no point of view whatsoever about how long it
should take to transmit or receive some data. Yet the gb3 socket has
the point of view that it should take an infinite amount of time to
transfer a single byte.

> In fact, most OSes and languages let you
> specify buffer dimensions and timeouts (and blocking or non-blocking
> options, and many other). In most of your thoughts, you specifically
> refer to a single, relatively simple situation.

That's only for now. The proxy sits between unknown clients and
unknown servers, each with its own defaults for the number of sockets it
will create or accept. Without a timeout I cannot have a client create
four sockets to a remote server that only accepts two connections from
any one IP address, unless the remote server sends an explicit rejection
message for that socket. And since the remote server is unknown, I
cannot even guarantee that it will do such a thing: even if the protocol
says the remote server must send a rejection message, I have no way of
knowing that the remote server is fully protocol compliant.
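
That is exactly the gap a connect timeout covers. A rough C sketch of
the usual workaround at the C level (the ten-second figure is an
arbitrary example): put the socket in non-blocking mode, start the
connect, and give select() a deadline; if the deadline passes, the
proxy treats the extra socket as refused even though no rejection ever
arrived:

    #include <errno.h>
    #include <fcntl.h>
    #include <sys/select.h>
    #include <sys/socket.h>

    /* Connect with a deadline. If the remote server silently ignores
       the extra connection, select() times out and we report failure
       instead of blocking forever. */
    static int connect_with_timeout(int fd, const struct sockaddr *sa,
                                    socklen_t salen, int seconds)
    {
        fd_set wfds;
        struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };
        int err = 0;
        socklen_t errlen = sizeof(err);

        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

        if (connect(fd, sa, salen) == 0)
            return 0;                /* connected immediately */
        if (errno != EINPROGRESS)
            return -1;               /* explicit, immediate failure */

        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        if (select(fd + 1, NULL, &wfds, NULL, &tv) <= 0)
            return -1;               /* deadline passed: no answer at all */

        /* The connect finished one way or the other; ask which. */
        getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen);
        return err == 0 ? 0 : -1;
    }

Called with, say, connect_with_timeout(fd, sa, salen, 10), the proxy
finds out within ten seconds that the third and fourth sockets are
going nowhere, rejection message or not.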

> Why not! A single
> situation is a nice situation to speak about, but there are many
> different ones. I think that non-blocking sockets are good for the example
> you sent in this list; but for your real application (a proxy, right?),
> a non-blocking system, with no timeouts except for errors, would be
> better suited. Just my thought.

Without a timeout I cannot create more than a single socket and be
certain that the remote server will accept it. I would be very happy
if the socket classes (Socket and ServerSocket) accepted timeouts and
raised errors when the connection timed out, where a timeout means a
period of time during which there is no activity while a send or
receive is in progress. That, btw, is why the Linux socket API
implements both a send and a receive timeout.
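
For what it is worth, at the C level, giving a connected socket both of
those is a handful of lines (a sketch; the value passed in is whatever
the application has determined empirically):

    #include <sys/socket.h>
    #include <sys/time.h>

    /* Give a connected socket both a send and a receive timeout, so
       either direction going idle is reported as an error (EAGAIN /
       EWOULDBLOCK) instead of blocking forever. */
    static int set_socket_timeouts(int fd, int seconds)
    {
        struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };

        if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
            return -1;
        return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
    }

All I am asking is that the gb3 Socket expose the same thing as a
property and raise a trappable error when it expires.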

Regards,



