[Gambas-user] [slightly OT] No Problem with Gambas or MySql!

nando nando_f at ...951...
Mon Oct 7 17:44:39 CEST 2013


Lovely


---------- Original Message -----------
From: Caveat <Gambas at ...1950...>
To: gambas-user at lists.sourceforge.net
Sent: Mon, 07 Oct 2013 14:04:55 +0200
Subject: Re: [Gambas-user] [slightly OT] No Problem with Gambas or MySql!

> Thanks Randall, we got some great discussions going here.  I think the 
> closing paragraph sums it up quite well...
> 
> "
> 
> So err on the side of caution and normalize your data until you have
> the experience needed to know when to break the rules. If you always
> normalize your data you will be correct 99.999991 percent of the time. That
> last 0.000009 percent of the time you may never get to....
> 
> "
> 
> Of course, as I later said, my pennies example was a prime candidate for 
> normalisation!  And, I think, the case Willy is dealing with would have 
> been normalised (in an ideal world) from:
> 
> | ID  | Gewicht |
> | 345 | 100     |
> | 345 | 100     |
> | 345 | 50      |
> | 347 | 200     |
> | 347 | 200     |
> | 347 | 200     |
> | 347 | 200     |
> 
> to:
> 
> | ID  | Gewicht | Count |
> | 345 | 100     | 2     |
> | 345 | 50      | 1     |
> | 347 | 200     | 4     |
> 
> with a Primary Key composed of ID and gewicht (weight).
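That collapse from duplicate rows into counted rows can be sketched in code, using Python's built-in sqlite3 module as a stand-in for MySQL (the table and column names are just for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Legacy layout: one row per item, duplicates allowed.
cur.execute("CREATE TABLE legacy (id INTEGER, gewicht INTEGER)")
cur.executemany("INSERT INTO legacy VALUES (?, ?)",
                [(345, 100), (345, 100), (345, 50),
                 (347, 200), (347, 200), (347, 200), (347, 200)])

# Normalised layout: (id, gewicht) is the composite primary key,
# and the duplicates are collapsed into a count.
cur.execute("""CREATE TABLE normalised (
                   id INTEGER,
                   gewicht INTEGER,
                   cnt INTEGER NOT NULL,
                   PRIMARY KEY (id, gewicht))""")
cur.execute("""INSERT INTO normalised
               SELECT id, gewicht, COUNT(*) FROM legacy
               GROUP BY id, gewicht""")

print(cur.execute("SELECT * FROM normalised ORDER BY id, gewicht").fetchall())
# → [(345, 50, 1), (345, 100, 2), (347, 200, 4)]
```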
> But, Willy is working with a legacy system, and we don't know for sure 
> why someone decided to make (for example) 4 identical 347-200 rows, 
> instead of one 347-200 row with a count of 4.  Perhaps it's a very busy 
> system and the designer thought having many people able to create and 
> delete arbitrary numbers of ID-weight rows would make it easier to 
> manage the counts/weights in a system with many concurrent users, rather 
> than having people queue up to modify the count on a normalised 347-200 
> row?  Who knows!
> 
> Kind regards,
> Caveat
> 
> On 06/10/13 20:24, Randall Morgan wrote:
> > If you need some books on relational databases, I can give you private
> > access to my library of digital/scanned books. They will explain things
> > much better than we have here and teach you SQL. Also, know that relational
> > databases are only one type of database. There are other types of
> > databases not yet directly supported by Gambas. But 99% of databases are
> > indeed relational, and even those that are not typically carry along some
> > of the ideas from relational databases.
> >
> >
> >
> >
> > On Sun, Oct 6, 2013 at 11:17 AM, Randall Morgan <rmorgan62 at ...626...> wrote:
> >
> >> A simple explanation as to why you do not repeat data in a relational
> >> database. (Very simplistic):
> >>
> >> Relational databases normalize data so that any single bit of data is
> >> stored only once. This allows the data to be edited in only one location
> >> and also requires the least amount of storage. By centralizing each bit of
> >> data into one location you gain power. For example, say you have a
> >> customer database where customers must enter their city each time they
> >> place an order. Some customers may misspell the city name, or use an
> >> abbreviation. Allowing such entries into the data would make using the
> >> address information difficult, and it would also make it difficult to
> >> search the database for all customers from a particular city.
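A minimal sketch of that idea, again with Python's sqlite3 standing in for MySQL (the table and column names are hypothetical): the city name lives in exactly one row, so one edit in one place corrects it for every order.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# The city name lives in exactly one row; orders only reference it.
cur.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
cur.execute("""CREATE TABLE customer_order (
                   id INTEGER PRIMARY KEY,
                   customer TEXT,
                   city_id INTEGER REFERENCES city(id))""")

cur.execute("INSERT INTO city (name) VALUES ('Amsterdamm')")  # misspelt once
ams = cur.lastrowid
cur.executemany("INSERT INTO customer_order (customer, city_id) VALUES (?, ?)",
                [("Jan", ams), ("Piet", ams), ("Klaas", ams)])

# One edit, in one location, fixes the spelling for every order at once.
cur.execute("UPDATE city SET name = 'Amsterdam' WHERE id = ?", (ams,))

rows = cur.execute("""SELECT o.customer, c.name
                      FROM customer_order o
                      JOIN city c ON c.id = o.city_id""").fetchall()
print(rows)
# → [('Jan', 'Amsterdam'), ('Piet', 'Amsterdam'), ('Klaas', 'Amsterdam')]
```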
> >>
> >> If you allow the same information in multiple places, you will cause
> >> situations where the data must be edited in multiple places or every
> >> search/query must be written to handle an almost endless list of
> >> possibilities for each value of interest.
> >>
> >> Say you allow multiple records for the same order. When you update the
> >> order status, one record will show the order with the updated status and
> >> one without. Which one is correct? Do you assume the one with the latest
> >> status is correct? What if someone in the shipping department made an
> >> error? Now you have confusing records.... You also have wasted space,
> >> which costs real money for real storage, real CPU time to locate the
> >> correct record (or all records), and real time for the user to review
> >> and decide which of the multiple records is correct. The user may also
> >> have to do additional work to combine information from multiple records
> >> to compile a complete picture of the data.
> >>
> >> By enforcing data normalization (the art of not repeating data values),
> >> you limit the amount of work both the CPU and the end user must do. You
> >> build a system where there is only one entry to edit or update. You
> >> provide a single source of truth and a single point of contention and
> >> resolution for conflicts.
> >>
> >> Data normalization is to data what OOP is to programming. It is not
> >> perfect, and can be taken to extremes. But it is a valuable tool and should
> >> be used in both coding and data storage/retrieval. It is not the only tool
> >> and there are the rare edge cases where one may need to break the rules.
> >> But I can pretty much guarantee you that if you don't understand why you
> >> need it, you don't yet understand when you should and should not break the
> >> rules. So err on the side of caution and normalize your data until you have
> >> the experience needed to know when to break the rules. If you always
> >> normalize your data you will be correct 99.999991 percent of the time. That
> >> last 0.000009 percent of the time you may never get to....
> >>
> >>
> >>
> >>
> >> On Sun, Oct 6, 2013 at 2:55 AM, Charlie Reinl <Karl.Reinl at ...9...>wrote:
> >>
> >>> Am Sonntag, den 06.10.2013, 10:55 +1030 schrieb Bruce:
> >>>> On Sun, 2013-10-06 at 00:58 +0200, Willy Raets wrote:
> >>>>> On Sun, 2013-10-06 at 00:05 +0200, Caveat wrote:
> >>>>>>> If you wanted to delete one row with values 347 200, then which
> >>>>>>> row would it be?
> >>>>>> It *doesn't matter*!  The rows are the same, so pick a row, ANY row!
> >>>>> That is exactly what I have been doing for the last 10 years.
> >>>>> In VB code I just looked for the criteria in the table and deleted the
> >>>>> first record that met the criteria, next update table and ready.
> >>>>>
> >>>>> In Gambas I tried the same, now with the table being in MySql, and it
> >>>>> didn't work.
> >>>>>
> >>>>> It never, ever failed me in those 10 years, even without a primary
> >>>>> key.
> >>>>> Although most other tables do have primary keys and relations set up,
> >>>>> there never was a need for this table to have a primary key.
> >>>>> And I still do not see the need for a primary key.
> >>>>>
> >>>>>>> Without the primary key, you're asking MySQL to guess, and it
> >>>>>>> won't.
> >>>>>> Yes it will, if you use LIMIT (see my previous two mails).
> >>>>>> Otherwise, it'll happily delete all the rows that meet the criteria,
> >>>>>> whether they're duplicates or not.
> >>>>> And even in MySql there is NO need for the table to have a primary key
> >>>>> as the 'SELECT ... limit 1' will do what I expect, delete only 1 record
> >>>>> that meets the criteria.
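In MySQL, `DELETE ... LIMIT 1` does exactly that. Stock SQLite has no LIMIT clause on DELETE, but the same "pick any one of the identical rows" effect can be sketched through the hidden rowid (a sketch with Python's sqlite3; the table name is illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE t (id INTEGER, gewicht INTEGER)")
cur.executemany("INSERT INTO t VALUES (?, ?)", [(347, 200)] * 4)

# MySQL would allow: DELETE FROM t WHERE id=347 AND gewicht=200 LIMIT 1
# Stock SQLite lacks LIMIT on DELETE, so pick one hidden rowid instead --
# which row gets picked genuinely does not matter: they are identical.
cur.execute("""DELETE FROM t WHERE rowid =
               (SELECT rowid FROM t WHERE id=347 AND gewicht=200 LIMIT 1)""")

print(cur.execute("SELECT COUNT(*) FROM t").fetchone()[0])
# → 3
```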
> >>>>>
> >>>>>>>> "duplicate rows are and always were a mistake in SQL", C.J.Date
> >>>>> Duplicate rows are a reality in real-life data gathering.
> >>>>> I have been working with databases since dBase somewhere in the '80s,
> >>>>> and I have had to deal with real-life situations concerning data
> >>>>> storage, not with theory for ideal models. Not saying I haven't read my
> >>>>> share on relational database design and take it into account when
> >>>>> designing a database.
> >>>>>
> >>>>>> The guy has obviously never read William Kent's "Data and Reality".
> >>>>>> It's a book I'd still recommend to anyone working with data and
> >>>>>> particularly databases, even though it was written 35 years ago!
> >>>>>>
> >>>>>> <RANT>
> >>>>>> Why *must* every record be uniquely identifiable?  Explain why...
> >>>>>> don't just throw random snippets of relational database theory at me!
> >>>>>> If I ask you to model the contents of your pockets on a database, are
> >>>>>> you going to attach some artificial key to each individual penny? Why?
> >>>>>> They each buy the same number of penny chews (showing my age!), and if
> >>>>>> you take 3 pennies away, you've still got 3 pennies less than you had
> >>>>>> before, no matter how carefully you select those 3 pennies!
> >>>>>> </RANT>
> >>>>>>
> >>>>>> <HINT>
> >>>>>> A tuple is, by definition, unique.  If I have 8 rows on my database
> >>>>>> representing 8 real-world penny objects and I can delete 3 of them...
> >>>>>> are the remaining 5 rows tuples?  How closely does the database model
> >>>>>> reality in this case?
> >>>>>> </HINT>
> >>>>> I like your penny story ;)
> >>>>>
> >>>>>
> >>>> [ASIDE!]
> >>>>
> >>>> Interesting discussion, but someone recently asked me why I prefer
> >>>> postgresql over mysql and this thread reminded me that I never answered.
> >>>>
> >>>> One of those reasons is that this whole situation cannot occur in
> >>>> postgresql.  It has been designed such that it cannot. Considering a
> >>>> non-normalised table, i.e. one that has no natural (unique) primary key,
> >>>> all rows are still capable of being operated on using the standard SQL
> >>>> commands. However, in that case, *all* rows that satisfy the WHERE
> >>>> clause will be acted on. Full stop.
> >>>>
> >>>> Putting that another way, postgresql "doesn't really care" whether the
> >>>> table has a primary key set or not. Nor does it care whether a delete or
> >>>> an update query uses a unique row identifier.  That sounds pretty slack,
> >>>> doesn't it?  Well, not at all really.  It relies on the WHERE clause to
> >>>> identify the set of rows to be operated on and that is all.  If such a
> >>>> clause identifies multiple rows then they will all be affected. So, to
> >>>> continue the coin collection analogy and a theoretical sql command:
> >>>> SELL COIN WHERE YEAR="1930" AND ALLOY="Cupronickel";
> >>>> would sell all the coins that match.  The idea that you could write a
> >>>> command like:
> >>>> SELL THE FIRST COIN YOU FIND WHERE YEAR="1930" AND ALLOY="Cupronickel";
> >>>> really, to me brings a sense of randomness into the game ... ??? What is
> >>>> the dbms supposed to imagine "THE FIRST COIN YOU FIND" to mean?
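The set-based behaviour Bruce describes is easy to demonstrate: the WHERE clause identifies a set of rows and the whole set is affected (a sketch with Python's sqlite3; the coin table is hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE coin (year INTEGER, alloy TEXT, sold INTEGER DEFAULT 0)")
cur.executemany("INSERT INTO coin (year, alloy) VALUES (?, ?)",
                [(1930, "Cupronickel"), (1930, "Cupronickel"), (1930, "Silver")])

# The WHERE clause identifies a set; the whole set is affected. Full stop.
cur.execute("UPDATE coin SET sold = 1 WHERE year = 1930 AND alloy = 'Cupronickel'")
print(cur.rowcount)
# → 2: both matching coins were 'sold'; the Silver one was untouched
```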
> >>>>
> >>>> Here in the thoroughbred racing industry, we have a somewhat similar
> >>>> real world situation.  All thoroughbreds have a registered name and
> >>>> there are strict controls over the reuse of a name.  However, horse
> >>>> owners are allowed to change the registered name of the horse
> >>>> occasionally and under certain conditions. All pretty good and you could
> >>>> imagine that the name "Lemon Liniment" would uniquely identify a
> >>>> specific horse from any of the other 61,000 we have on our database.
> >>>> Further you might imagine that if the owners changed its name to
> >>>> "Cherry Zontop" then all we would have to do is, once the name change
> >>>> has been registered, change that "unique" primary key.  Hah ha!  No such
> >>>> simplicity exists on the planet.
> >>>>
> >>>> First off, the horse is really and truly identified by its microchip ID.
> >>>> This is what the register uses as the unique identifier.  However, they
> >>>> don't publish it (and for good reasons, if you think about it ...) and
> >>>> secondly, name change events are often publicised some
> >>>> time after the actual change has been registered. This is a major,
> >>>> major problem (to us, logistically, not technically).
> >>>>
> >>>> Suppose Jack (the owner of Lemon Liniment) sells it to Jill (on
> >>>> Wednesday) who re-registers it as Cherry Zontop.  Jill then races it
> >>>> once on the next Saturday and we see a race entrance and result appear
> >>>> in the official race records for Cherry Zontop.  We think that Cherry
> >>>> Zontop is a "new untried" horse and add it as a new horse in the
> >>>> database.  On the Monday (or later) the registry publicises the name
> >>>> change...
> >>>>
> >>>> Under MySQL merging the histories of Lemon Liniment into Cherry Zontop
> >>>> was a technical nightmare that essentially required offloading and
> >>>> reloading multiple sets of rows from multiple tables*.  Under postgresql
> >>>> it is the execution of one SQL command:
> >>>> UPDATE horse SET name='Cherry Zontop', namechanged=TRUE, oldname='Lemon
> >>>> Liniment' WHERE name='Lemon Liniment';
> >>>>
> >>>> * I have simplified the story a fair bit, there are around 18 underlying
> >>>> tables that have the horse name as the foreign key or as part of the
> >>>> foreign key.
> >>>>
> >>>> postgres, because it properly supports "ON UPDATE CASCADE" constraints
> >>>> on those tables and their foreign key columns, handles it all internally.
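SQLite can demonstrate the same cascade behaviour once foreign-key enforcement is switched on; a minimal sketch (the two-table schema is a drastic simplification of Bruce's 18 underlying tables):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("PRAGMA foreign_keys = ON")  # FK enforcement is off by default

cur.execute("CREATE TABLE horse (name TEXT PRIMARY KEY)")
cur.execute("""CREATE TABLE race_result (
                   horse_name TEXT REFERENCES horse(name) ON UPDATE CASCADE,
                   finished INTEGER)""")

cur.execute("INSERT INTO horse VALUES ('Lemon Liniment')")
cur.execute("INSERT INTO race_result VALUES ('Lemon Liniment', 1)")

# One UPDATE on the parent row; the engine rewrites the child rows itself.
cur.execute("UPDATE horse SET name = 'Cherry Zontop' WHERE name = 'Lemon Liniment'")

print(cur.execute("SELECT horse_name FROM race_result").fetchone()[0])
# → Cherry Zontop
```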
> >>>>
> >>>> Anyway, as I said, [ASIDE] and just one of the reasons for preferring
> >>>> postgresql.
> >>>>
> >>>> regards
> >>>> Bruce
> >>> Salut Bruce,
> >>>
> >>> how do you look back to 'Lemon Liniment' history ?
> >>>
> >>>
> >>> --
> >>> Amicalement
> >>> Charlie
> >>>
> >>>
> >>>
> >>> ------------------------------------------------------------------------------
> >>> October Webinars: Code for Performance
> >>> Free Intel webinars can help you accelerate application performance.
> >>> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most
> >>> from
> >>> the latest Intel processors and coprocessors. See abstracts and register >
> >>>
> >>> http://pubads.g.doubleclick.net/gampad/clk?id=60134791&iu=/4140/ostg.clktrk
> >>> _______________________________________________
> >>> Gambas-user mailing list
> >>> Gambas-user at lists.sourceforge.net
> >>> https://lists.sourceforge.net/lists/listinfo/gambas-user
> >>>
> >>
> >>
> >> --
> >> If you ask me if it can be done. The answer is YES, it can always be done.
> >> The correct questions however are... What will it cost, and how long will
> >> it take?
> >>
> >
> >
> 
------- End of Original Message -------




