[Gambas-user] gb.web.feed not stable yet?

Tobias Boege taboege at gmail.com
Sat Jan 5 17:41:37 CET 2019


On Fri, 04 Jan 2019, T Lee Davidson wrote:
> > There are two "milestones" for the component:
> > 
> >   (1) Every RSS date must be formatted according to (a minor variant of)
> >       RFC 822, which includes a timezone string. I think if you publish
> >       e.g. local news, setting a timezone independent of the local one
> >       is an important feature. I wanted to do this since the beginning [1]
> >       but it *may* break the existing interface [2]:
> > 
> >         About the incorporation of timezones in gb.web.feed (the last
> > 	thing before I mark the component as "Unfinished but stable"),
> > 	my plan is to replace the Date variables in the Rss* classes
> > 	by an RssDate compound, consisting of a normalised Date and a
> > 	Timezone string (or constant), with an "apply timezone" method
> > 	probably. If you have a better idea, please let me know.
> 
> I may be missing a particular need in regards to this, but I would just store dates as either RFC822 strings or rely on the
> internal UTC representation and convert on retrieval/output (ie. a _get) as needed.
> 
> > 
> >       I also remember issues with the date conversion, some of which got
> >       fixed, but others not [3,4]. As I said 20 months ago, when this
> >       is done, the component can be marked "Stable".
> 
> I find only one issue (see below *) with the questions you raised at your ref #3. Therefore, I am unsure exactly what needs to
> be fixed. And, as much as I try, I cannot understand the need to separate the timezone (ref #4). It is available in both a
> RFC822 date string and in the internal date representation as UTC.
> 

The need arises like this: when you read an RSS document, you likely want
to get a Date, not a string describing the date. That's how we deal with
dates in Gambas. But then, "time is absolute in Gambas", as often repeated
by Benoît, i.e. a Date object denotes a point in time and has no need for
a timezone component. The timezone must be dealt with in input/output,
as you said. Now imagine I want to run an RSS-based newspaper "Magdeburg Times"
for the German city I live in. My server happens to be located in Australia
with an Australian locale and timezone. I want the news items to be PubDate'd
according to what readers expect (Europe/Berlin), but e.g. the LastBuildDate
should be in the local timezone of the server because that's an event actually
happening on the server.

If that doesn't convince you, consider a feed aggregator service, which
reads multiple feeds online (like all Gambas-related forums, blogs and the
gitlab repo feed) and provides them as a single feed ("Gambas Today").
All these feeds can have different timezones and the current implementation
of gb.web.feed would just forget all of them. The dates displayed in the
aggregated feed would point to the same *points in time* but they would
all be relative to the timezone of the machine doing the aggregation.

My point is: timezones bear meaning and gb.web.feed should be able to
preserve them. Thus I want the ability to *independently* set the timezone
of every single Date field in the entire feed on output. [ Note that input
is already covered by Gambas: it can read a timezone'd date and create a
matching (absolute) Date object that we all like to work with, but not
so much like to display, as witnessed by this mailing list :-) ]

Yes, I know that making every Date into a (Date, TimeZone) compound
may be too intrusive for something that people usually don't care about.
But what else can I do? It would turn code like

  $hRssItem.PubDate = Now

into

  $hRssItem.PubDate.Date = Now      ' and optionally
  $hRssItem.PubDate.Zone = "+0100"  ' or:
  $hRssItem.Pub.Date = Now
  $hRssItem.Pub.Zone = "+0100"

If that's too ugly, I could also leave PubDate alone and add the compound
as a second property PubDateTime(?), which either takes precedence over
the plain PubDate field when it is set, or is kept synchronized with it.
That way you could use PubDate if you don't care about timezones but
if you read an RSS document, the parser will always fill both fields,
so an aggregator would preserve the timezone out-of-the-box.

Since you sound like a potential user, what would you like to use?

> In ToRFC822(), a pre-correction with, "System.TimeZone / 86400" is necessary due to the localization adjustment that the
> subsequent Format() will apply. If this correction were not made prior to passing the date-time's component value to Format, the
> resulting timezone would be off by -(System.TimeZone / 86400).
> 
> Regarding FromRFC822() _as I understand it_, since dates are stored internally as UTC [0] and the parameters for the Date()
> function assume local time [1], "dDate -= Frac(Date(Now))" is necessary to adjust the localized dDate to a correct UTC
> representation. Then, "dDate += GetRFC822Zone(aDate[6])" adjusts the (now correct UTC) date for the requested timezone
> (summation is used due to fZone being defined as the appropriate positive or negative offset).
> 

While all of that sounds correct ...

> * However, there does seem to be an issue with FromRFC822():
> 
> [Code]
> Public Sub Main()
> 
> Dim sDate As String = "Sun, 21 Apr 2019 05:00:00 GMT"
> Dim dDate As Date
> 
> dDate = Date.FromRFC822(sDate)
> Print CFloat(dDate)
> Print Frac(dDate) * 24
> Print Date.ToRFC822(dDate)
> Print Date.ToRFC822(dDate, "EST")
> Print Date.ToRFC822(dDate, "+0200")
> ' Get timezone from RFC822 date string
> Print Split(sDate, " ").Last
> 
> End
> [/Code]
> 
> [Result]
> 2490699.20833333
> 5.00000000372529
> Sun, 21 Apr 2019 05:00:00 GMT
> Sun, 21 Apr 2019 00:00:00 EST
> Sun, 21 Apr 2019 07:00:00 +0200
> GMT
> [/Result : Correct]
> 
> With `Dim sDate As String = "Sun, 21 Apr 2019 00:00:00 EST"`
> [Result]
> 2490698.79166667
> 18.9999999962747
> Sat, 20 Apr 2019 19:00:00 GMT
> Sat, 20 Apr 2019 14:00:00 EST
> Sat, 20 Apr 2019 21:00:00 +0200
> EST
> [/Result : Incorrect]
> 
> Perhaps my understanding is quite faulty.
> 

... my understanding is the same. In short:

  Print Date.ToRFC822(Date.FromRFC822("Sun, 21 Apr 2019 05:00:00 +0100"), "+0100")
  > Sat, 20 Apr 2019 07:00:00 +0100

seems to be buggy (look not only at the time, but also at the date!).
The bug is in this line right after a local Date object is created from
the parsed items:

  dDate -= Frac(Date(Now))

After digging through the Frac and Date documentation, I guess this is
supposed to subtract the local timezone. But it is wrong. A Date has a Float
representation which is automatically used when you perform direct arithmetic
on it, or when you call Frac. The integral part of this float is the number
of days since the epoch; the fractional part is the time of day in
milliseconds divided by 86400000 (the number of milliseconds in a day
without leap seconds). Date(Now) (or really just the built-in Date function)
returns the current date with zeroed out time -- zeroed out *local* time.
The internal float will still have the timezone attached to it in some way.
Let's see:

  Print Frac(Date)
  > 0.95833333348855

This is in +0100. My timezone (1 hour = 0.04167 days) is what is missing in
this number to reach an integral 1.0, which makes sense, I guess, since my
timezone must be subtracted to yield UTC. 0.95833 is today's local midnight
in +0100 as seen from UTC, i.e. 23:00 of the previous day. The crucial point
is that Frac(Date) is a *fraction modulo 1.0*. This explains why your tests
(I expect you are in EST?) weren't as drastically broken as mine (almost
always losing a day):

  - The Frac(Date) on a system in the EST (-0500) timezone is a small
    positive number which is what must be subtracted to annihilate the
    local timezone.

  - The Frac(Date) on a system in the CET (+0100) timezone is a large
    number below 1.0 which represents a *negative* fraction.

You don't want to subtract 0.95833 (almost always losing a day), but you
want to subtract "0.95833 modulo 1.0", i.e. its signed representative
0.95833 - 1.0 = -0.04167, which amounts to adding 0.04167 in this case.
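
To make the arithmetic concrete, here is a minimal sketch of the idea only.
The helper name SignedDayFraction is made up for illustration and is not
part of gb.util, and the 0.5 cut-off is an assumption that only holds for
timezones within 12 hours of UTC:

```gambas
' Hypothetical helper, for illustration only (not part of gb.util).
' Maps a value of Frac(Date) from [0, 1) to its signed representative
' in [-0.5, 0.5). Assumes the local timezone is within 12 hours of UTC.
Private Function SignedDayFraction(fFrac As Float) As Float

  If fFrac >= 0.5 Then
    Return fFrac - 1.0
  Endif
  Return fFrac

End

Public Sub Main()

  ' Value of Frac(Date) observed on my +0100 system, converted to hours
  Print SignedDayFraction(0.95833333348855) * 24
  ' 5 / 24 is the value I would expect on a -0500 system
  Print SignedDayFraction(5 / 24) * 24

End
```

The first value should come out at roughly -1 (hour), so subtracting it
adds an hour; the second at roughly 5, which is subtracted as is.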

I pushed some fixes, including this one, to gb.util in 3a16b32ad.
The component now does the following, which feels correct to me:

  Print CStr(Date.FromRFC822("Sun, 21 Apr 2019 05:00:00 +0000"))
  > 04/21/2019 05:00:00

Because Print(CStr) should display the date in UTC and since the
Date was given in UTC, the time should not change.

  Print Format$(Date.FromRFC822("Sun, 21 Apr 2019 05:00:00 +0000"))
  > 04/21/2019 06:00:00

5 o'clock in UTC is displayed in my local timezone +0100.

  Print Date.FromRFC822("Sun, 21 Apr 2019 05:00:00 +0100")
  > 04/21/2019 05:00:00

5 o'clock in my local timezone is displayed as 5 o'clock, since the output
is again in my local timezone.

  Print Date.ToRFC822(Date.FromRFC822("Sun, 21 Apr 2019 05:00:00 +0000"), "+0000")
  > Sun, 21 Apr 2019 05:00:00 +0000
  Print Date.ToRFC822(Date.FromRFC822("Sun, 21 Apr 2019 05:00:00 +0100"), "+0100")
  > Sun, 21 Apr 2019 05:00:00 +0100
  Print Date.ToRFC822(Date.FromRFC822("Sun, 21 Apr 2019 05:00:00 +0800"), "+0800")
  > Sun, 21 Apr 2019 05:00:00 +0800

If source and destination timezone are the same, the string shouldn't change.

  Print Date.ToRFC822(Date.FromRFC822("Sun, 21 Apr 2019 05:00:00 +0000"), "+0100")
  > Sun, 21 Apr 2019 06:00:00 +0100
  Print Date.ToRFC822(Date.FromRFC822("Sun, 21 Apr 2019 05:00:00 +0000"), "+0800")
  > Sun, 21 Apr 2019 13:00:00 +0800

You gain hours when you convert from UTC to a "+" timezone (remember that
Sydney celebrates New Year's early).

  Print Date.ToRFC822(Date.FromRFC822("Sun, 21 Apr 2019 05:00:00 +0800"), "+0000")
  > Sat, 20 Apr 2019 21:00:00 +0000
  Print Date.ToRFC822(Date.FromRFC822("Sun, 21 Apr 2019 05:00:00 +0800"), "+0100")
  > Sat, 20 Apr 2019 22:00:00 +0100

Converting from a high "+" timezone to a lower one will lose you hours
(again, if you live in Germany, you'll watch Sydney New Year's fireworks
on TV in the afternoon of the 31st).
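
The hour shifts in these examples follow from plain offset arithmetic.
A rough sketch, assuming numeric zones of the form "+hhmm" or "-hhmm" only
(named zones like GMT or EST are not handled); ZoneToDays is a made-up name
and is not the GetRFC822Zone function from gb.util:

```gambas
' Hypothetical helper, for illustration only (not part of gb.util).
' Converts a numeric zone string such as "+0800" or "-0500" into a
' signed fraction of a day. Assumes the input is well-formed.
Private Function ZoneToDays(sZone As String) As Float

  Dim fDays As Float

  ' Characters 2-3 are the hours, 4-5 the minutes; 1440 minutes per day
  fDays = (CInt(Mid$(sZone, 2, 2)) * 60 + CInt(Mid$(sZone, 4, 2))) / 1440
  If Left$(sZone, 1) = "-" Then
    fDays = -fDays
  Endif
  Return fDays

End

Public Sub Main()

  Dim fShift As Float

  ' Shift to apply when re-expressing a +0800 time in +0100
  fShift = ZoneToDays("+0100") - ZoneToDays("+0800")
  Print fShift * 24

End
```

The shift should be about -7 hours, which is how 05:00 +0800 on the 21st
becomes 22:00 +0100 on the 20th.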

Now that this is hopefully out of the way (can you check if you read this,
Benoît?), let me know about your vote on the desired interface for timezones
in gb.web.feed and I can mark the component stable today (or tomorrow,
depending on your timezone).

Regards,
Tobi

-- 
"There's an old saying: Don't change anything... ever!" -- Mr. Monk

