Friday, June 6, 2014

What is a Database?




One would think that “database” is an easy, straightforward word that a native English speaker would not need explained. It is not an easy word to explain.  If I hold up an object to a native speaker, they should be able to tell me what it is.  If I hold up the object below, a child in France would say it is a “livre”; a German would say “Buch”; a Latin speaker, “liber”; a Hebrew speaker, “sefer.”  To a Yiddish speaker, however, what it is called depends on the content: the answer could be “buch” or “sefer.”
 


Suppose a reader came into the library and wanted a book.  The library has thousands of books, so you would ask him or her to be more precise.  Once you find a book in the catalog, the call number tells you where to find it. The reader starts with a general description of the kind of object wanted and eventually finds the precise items needed. A similar thought process could help a hungry person find food in a grocery store.

Abstract concepts are harder to define.  The meaning of “data” is a moving target.  One person’s data is the basis for another person’s information.  One “datum[1]” is the smallest unit that represents objects, events, or entities that have meaning in the user’s universe. The last part of this definition is where context and flexibility make an exact definition impossible.  In computer terms, 0’s and 1’s are bits, and bits are grouped into bytes.  In a simple encoding such as ASCII, each byte represents one character, such as a letter of the alphabet. Letters make words, words make sentences, sentences make paragraphs, and so forth.  A small example of a database is a sentence.  The characters (data) are organized into words and related to one another to give meaning. A database is an organized collection of data. The database may be on paper, electronic, or stored in any other medium one could imagine.  A directory, an encyclopedia, or a card catalog may be considered a database.
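
To make the step from bits to letters concrete, here is a minimal sketch in Python. It assumes plain ASCII text, where one byte stands for one character; the word "book" is just my example, and other encodings may use more than one byte per character.

```python
# Assumption: plain ASCII text, so each character fits in exactly one byte.
word = "book"

# Turn the word into bytes, then show each byte as eight bits.
encoded = word.encode("ascii")
bits = [format(byte, "08b") for byte in encoded]

for letter, pattern in zip(word, bits):
    print(letter, pattern)
```

Each line of output pairs one letter with the eight 0's and 1's that store it. The letters are the data; the word they form is where meaning begins.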

In March a librarian asked about the difference between updating a database and updating a website.  At first I was going to dismiss the question as too naïve to bother with, but then I read an article by Denis Pombriant in CRM Magazine [2], “Data versus knowledge.”  Pombriant talks about using data as a way to gain knowledge.  This flow of knowledge is a concept that I have long talked about.   He talks about data points that are not quantitative.   For example, shoes have a size, color, and style, but also shipping records, sales records, etc.  Taken together with the data from other sales, these give the business the knowledge to make informed decisions. Knowledge is a property of the human mind, but information and data are in constant motion.  He concludes that one must cultivate (in other words, “organize”) data in order to turn it into information.
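
A small sketch of the shoe example may help. The records below are hypothetical, invented only for illustration, and the function name `units_by_style` is my own. The point is that individual sales records (data) say little on their own, while a total per style (information) is something a business can act on.

```python
from collections import Counter

# Hypothetical sales records, invented for illustration only.
sales = [
    {"style": "sneaker", "color": "white", "size": 9, "quantity": 3},
    {"style": "boot", "color": "brown", "size": 10, "quantity": 1},
    {"style": "sneaker", "color": "black", "size": 8, "quantity": 2},
]

def units_by_style(records):
    """Organize raw sales records into total units sold per style."""
    totals = Counter()
    for record in records:
        totals[record["style"]] += record["quantity"]
    return totals

totals = units_by_style(sales)
print(totals.most_common())
```

Deciding which style to reorder is still a human judgment; the code only organizes the data so that the judgment can be made.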

If we follow this line of reasoning, data is the raw material that, once organized and stored, may become information.  There is no definition of data that can fit every situation.


Databases by their very nature are meant to be dynamic and always changing.  Paper databases, of course, change a lot more slowly than electronic databases.

Returning to the original question about the difference between updating a database and updating a website, one has to figure out the type of entity one is dealing with. There are two kinds of web pages: dynamic and static.  Dynamic web pages are formed with data from many sources.  Every time one visits the site, it may be different.  A dynamic page could be a portal to more information or could display information based on a search of a database.  For example, a web-based email program will display the mail with the features offered by the programmers.  The display changes based on the messages that come and go.  A library catalog or a retail business site are examples of web sites that search databases.  A static page is coded by the creator and will display the same way until the creator changes it.
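
The difference between the two kinds of pages can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the table name `books`, the function `render_catalog_page`, and the titles are all hypothetical, and a real library catalog is far more involved.

```python
import html
import sqlite3

# A static page: fixed text that stays the same until its creator edits it.
STATIC_PAGE = "<h1>Library hours</h1><p>Open 9 to 5.</p>"

def render_catalog_page(conn, keyword):
    """Build a dynamic page from whatever the database holds right now."""
    rows = conn.execute(
        "SELECT title FROM books WHERE title LIKE ? ORDER BY title",
        (f"%{keyword}%",),
    ).fetchall()
    items = "".join(f"<li>{html.escape(title)}</li>" for (title,) in rows)
    return f"<h1>Results for {html.escape(keyword)}</h1><ul>{items}</ul>"

# Hypothetical in-memory catalog with made-up titles.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT)")
conn.execute("INSERT INTO books VALUES ('Database Basics')")

before = render_catalog_page(conn, "Data")

# Updating the database changes the next page view; no page code was edited.
conn.execute("INSERT INTO books VALUES ('Data and Knowledge')")
after = render_catalog_page(conn, "Data")

print(before)
print(after)
```

The static page never consults the database, so adding a book does not touch it. The dynamic page is rebuilt on every visit, so the same search shows the new title as soon as the record is added.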

Updating a database is independent of the display of data on the screen.  The same data could be displayed on multiple interfaces.  Many libraries have multiple search options for their catalogs.  A static page, by contrast, changes only when an authorized user edits the page itself.
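
Here is a minimal sketch of one update reaching two interfaces. Everything in it is hypothetical: the `books` table, the two view functions, and the call numbers are made up for illustration.

```python
import sqlite3

# Hypothetical catalog; the title and call numbers are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, call_number TEXT)")
conn.execute("INSERT INTO books VALUES ('Sample Title', 'AAA 100')")

def brief_view(conn):
    """One interface: titles only."""
    rows = conn.execute("SELECT title FROM books ORDER BY title")
    return [title for (title,) in rows]

def shelf_view(conn):
    """Another interface: call number followed by title."""
    rows = conn.execute(
        "SELECT call_number, title FROM books ORDER BY call_number"
    )
    return [f"{call_number}  {title}" for call_number, title in rows]

# A single update to the data, made without touching either view.
conn.execute(
    "UPDATE books SET call_number = 'BBB 200' WHERE title = 'Sample Title'"
)

print(brief_view(conn))
print(shelf_view(conn))
```

The update is made once, in the database. Both views pick it up the next time they are displayed, because neither one stores its own copy of the data.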




[1] Just a reminder -- “data” is from Latin and is plural.  The singular is “datum.”  However, in common usage “data” is used as a singular noun.

[2] CRM means customer relationship management.  This is a publication aimed at helping businesses become more attuned to the needs of customers.  Full citation: Pombriant, Denis.  “Data versus knowledge: gaining insight from your data means rethinking its definition.”  CRM Magazine April 2014. Online: http://www.destinationcrm.com/Articles/Columns-Departments/Reality-Check/Data-Versus-Knowledge-95253.aspx

1 comment:

MrZed said...

Well, yes, but saying you should think about it and that it is fairly involved is not in itself useful. When I started, there were databases, but there was not yet a SQL. It seemed to encompass a collection of files which seemed easier to harness to a task than to maintain or modify. Around the word, there are a number of expectations that have grown up around it. There have been a number of solutions to satisfy these needs, quite distinct from one another. The finest of them thumb their noses at definitions and just do something vital to the code that utilizes them. The code was factored out of programs so the programmers could stop reinventing the wheel imperfectly every time they needed one. So a class of products grew up and a class of operators who can realize their potential. Conceiving of key loose files hanging around a system as something with greater potential was a leap.

All this is of little more use to understanding anything.