📜 ⬆️ ⬇️

GUID Gospel

Understanding the new Visual C # 2008 (it is so free for novice developers that I could not resist), I found a new word for myself in science and technology - GUID.

I give an example of an interesting, I think, article calling for the use of globally unique identifiers in all spheres of the national economy. The article is mainly about .NET and other microsoftware, but I think it will be useful to many here, because there are GUID implementations in almost all modern databases and languages ​​(including mySQL and PHP;).

PS: If it is interesting, then I will post a translation of the second part, where the author responds to comments on the first article.

')

GUID Gospel


In the Gospel of GUID there is only one commandment:

I. Always use the GUID to uniquely identify a table row.


When accepting new employees to the team, this is one of the first rules I teach them. Almost always, at first, they look at me with the look of a puppy with protruding ears and head bent to one side, as if to say “how is it?”

Therefore, I show them that in each table there is a field that is a unique identifier, and in most cases a primary key and, usually, a clustered index. For example, in the Employee table, we will have the EmployeeGUID field, a unique identifier and a clustered primary key. Here, I usually get back a glance "oops, my new boss is an idiot, but I don't want him to know that I think so." Then come questions and comments like "but I usually do this:"
  1. I use int
  2. I don't use GUIDs because they are so big.
  3. Don't you understand how difficult it is to find a record by such an id, in the case of int it is much easier
  4. Well, you will not use them in a table that implements a many-to-many relationship?
  5. What happens if GUIDs run out?
  6. I'm not sure that they will not be repeated.
  7. I have never heard of anything like this, so most likely this is a bad thought.
  8. And it does not worsen the performance?

And my mission on treatment begins. I have good statistics on the treatment of "thinking" people, so I urge you to read this article and maybe you will also become a believer in the GUID!

1) I do not need to make additional samples, and this is an increase in productivity!


There are many reasons for using a GUID as a primary key. The main thing for me is directly related to how I build object models. I prefer to create a “new” instance of an object without making a selection. So, by creating an Order object, I will not access the database to get an OrderID (OrderGUID in my world), as I would have done with the int OrderID. At this level is not too impressive, right? Think about this: I create an Order object with an OrderGUID, then OrderLineItem objects (an order line) with an OrderLineItemGUID without a ONE break in accessing the database. In the case of int, I would make 11 hits.

2) Combining data is so simple that it turns out even Mac developers!


The next reason to always use a GUID is data merging, which has proven necessary countless times. Before I saw the light, I also used int or something else to make the string unique, but when I had to merge data from wounded sources, I did special transformations.

Imagine:

DB1 ( 1):
Order ( )
OrderID = 1
CustomerID = 1

DB2 ( 2):
Order
OrderID = 1
CustomerID = 1


If Client 1 acquires Client 2 and I need to merge their data into a single database, I will have to change someone's OrderID and CustomerID for some int values ​​that I’m not using, then update a large number of records, and maybe I’ll dance with tambourine and with reference values ​​(seed values). Multiply this by dozens of tables, take into account the millions of rows of data, and see that I am facing a REALLY challenging task that will require dofig testing after writing SQL and / or code.

However, if I follow the gospel of the GUID:

DB1 ( 1):
Order ( )
OrderID: {C290A815-BAAA-4455-82EA-F18F5D0CEF2E}
CustomerID: {52335ABA-2D8B-4892-A8B7-86B817AAC607}

DB2 ( 2):
Order
OrderID: {79E9F560-FD70-4807-BEED-50A87AA911B1}
CustomerID: {60FA3045-BAE8-4526-87A2-DF22ED5F093B}


In this case, all that needs to be done is reduced to the usual insertion of all rows from one database to another. No transformations, no confused tests, simple, convenient and effective. Recently, I had to do this operation with the database of my two clients, AT & T and Cingular. The whole "transformation" took 45 minutes.

Another simple example: Imagine that your clients often work remotely offline, and you have to upload their data to a common database when connected. Now it's easier than taking a candy from a child ... © If you believe in a GUID. You can easily drag data between databases.

3) Tipo-independence


The third reason why I believe in a GUID is what I call “type-independence” (Type Ignorance is a pun built on Type Inference in .NET 3.5). Its essence is that it does not matter to me how the data is stored in each database table. Imagine that we have a product that has been in commercial use for some time. The database has tables Customer, Order, OrderLineItem, Product (Product) and Vendor (Supplier). Now it turned out that you need to add "notes" to each type of object. For this it is enough just to create a table.
Notes
NoteguidUnique Identifier
ParentGUIDUnique Identifier
NoteVarChar (500)

Now you can insert notes into this table using the GUID of the objects themselves in the ParentGUID. To get all the notes for a particular product, the simplest selection of notes is done by its GUID.

For example, to get all the notes by supplier, you just need to create a simple (join) Note.ParentGUID to Vendor.VendorGUId. No type indicators are needed, no need to invent which tables to link, no need for heaps of reference tables to understand what type of object a row is associated with.

You will be surprised to know how often this small technique is used. Recently, we added an “audit” to one of our applications, in which we wanted to find out who deleted what, added or changed in the database. We simply added a few lines of code to the DataContext SubmitChanges () method (we use only LINQ in this application) to create the corresponding entry in the audit table. When a new object or type is created in the application, the entry in this table occurs automatically, which allows us not to sweat by writing a special “audit” code when adding new data types in the application.

4) Blow, blow, still blow ... (c)


There are many less obvious reasons for using a GUID, but there is one that I did not foresee in advance and for which I thank the GUID, for he and he alone saved millions of dollars to my client ... yes, I told MILLIONS!

I developed a system of managing automatic payments for advertising for a large customer. They had to be able to pay bills with a total amount of millions of dollars at the press of a button. In a nutshell, at the press of a button, our system generates a file with a queue and sends it to their payment server, which will generate checks ... and the money will be gone. Of course, I used the GUID to identify anything and everything, so when the payment server generated the reconciliation file, I could easily run it through my database.

A working client database and a test database, a slightly outdated copy of the working one (for a couple of months) was deployed on our website. In the process of testing, someone on their side saw one of our test files with a queue of payments and, without hesitation, fed them to the payment server. Well, then you understood ... The client paid a heap of valid content providers twice (once on a real request, a second time on a test one), and also not quite normal suppliers (for example, those that did not advertise anymore, because the test database was outdated on a couple of months). So, without any jambs on my part, I got a terrible dustbin in the data ... well, at least my client thought so. However, since all my payment records had GUIDs, I could easily select those records that came from the test base to cancel payments on them. Imagine if I used INT, I wouldn’t have a way to find out from which database the PaymentID = 1000 request came from, for example.

So how did this save millions? Just ... multiply thousands of requests for cancellation fees ($ 20-30). And three more, since this mistake was repeated three times!

Are there any drawbacks to the GUID?


In short, yes, there is. However, they are so insignificant that they can not change my opinion. The most obvious of them is to write SQL queries manually (when you need to find something).

SELECT * FROM ORDER WHERE ORDERID = 12

much easier than

SELECT * FROM ORDER WHERE ORDERGUID = '{45F57B42-38A4-46ce-A180-6DE0E7051178}'

Another disadvantage is a slight decrease in the performance of links built on the basis of gUID, compared to INT. But in my experience, even using tables with a multimillion number of rows, it never became a problem. A few milliseconds of delay is a small price for all the delights of a GUID.

Try this technique in a small project, especially if you are still skeptical. I think it will be more useful than you could ever want.

- Published Sunday, November 04, 2007 10:35 PM by wwright
Filed under: .NET, ASP.NET, Tips and Tricks, LINQ, SQL Server

Source: https://habr.com/ru/post/31632/


All Articles