Monday, December 26, 2016

Vocabulary of the Search




Too many Internet users think that Google or other search engines are the best way to search for information. Yesterday a friend was asking questions about the history of the synagogue we were in. I told him that my son wrote an article about a part of that history. The article was not published in any source that he could find with Google. He said that he was going to check with Google anyway. I tried to tell him that there is a good reason libraries pay big bucks for databases while Google makes its money selling ads. I even offered to send him the article if he would give me his e-mail address.



Today Google came through.  With my alerts I found out that he searched for the information.  He found the article (home.earthlink.net/~ddstuhlman/crc94.pdf)  that I wrote about helping my son write the article. He did not find the article my son wrote.

In 2003 I taught a beginning course in database searching.  Part of understanding the best search strategy is to understand how databases work.  Here is a list of vocabulary terms that are as relevant today as they were back in 2003.  I made only minor updates.

         Byte: The smallest unit of data in computer memory or a file is the bit, a single 0 or 1. A byte is a group of eight bits. Bytes are the building blocks for programs and data. One byte can store one character of plain English (ASCII) text.
          
         Character: In written language the smallest unit of information is a letter of the alphabet.  Letters represent phonemes (the smallest units of sound) and combine into morphemes (the smallest grammatical units) and eventually words.  Words form phrases and sentences.
          
         Field:  An identified element of a record that contains alphabetic or numeric data, e.g. the title field or the author field.
    
         Database:  A collection of data and/or information.
       
         File:  A collection of related information. On a computer, files contain programs, text, or data.
    
         Record:  Unit of a file that contains all information regarding a particular item.
       
         Database producer:  A company that collects and organizes data, turns data into information, and creates machine-readable files.
        
         Vendor:  An organization that sells information access to institutions or consumers.
       
         Databank:  A group of databases that are vended by the same company.
       
         Information retrieval:  Making a given collection of stored information available to users who want access.
        
         End User: Person who does the search or the person needing the information.
          
         False Drop: A citation produced from a logically correct search that is not relevant to the user’s needs.
         
         Hits or Postings: Both terms are used to indicate the number of documents or citations reported.

         Information: Organized data that has been arranged for better comprehension or understanding. One person's information can become another person's data.

         Relevant:   Results that are useable, appropriate, or on topic.  A highly subjective term that only the user of the information can judge and even then may judge inconsistently. 
         
         Recall:  The proportion of all the relevant items in a file that the search actually retrieves.
       
         Precision:  The proportion of the items retrieved that are actually relevant.  Ideally, a good search has both good precision and good recall; in reality you usually have to sacrifice one for the other.
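
Several of the terms above (record, field, hits, false drop, recall, and precision) are easier to see with a small worked example. The following is only a sketch in Python, not how any real database vendor implements searching. The sample records, the `search` function, and the list of records the user considers relevant are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One record: all the information about a single item, split into fields."""
    record_id: int
    title: str   # title field
    author: str  # author field


# A tiny "file" of records. All of this sample data is hypothetical.
FILE = [
    Record(1, "History of the Synagogue Building", "Author A"),
    Record(2, "Synagogue Architecture Styles", "Author B"),
    Record(3, "Cooking for the Holidays", "Author C"),
    Record(4, "A Congregation's First Fifty Years", "Author D"),
]


def search(records, term):
    """Return the records whose title field contains the term (case-insensitive)."""
    term = term.lower()
    return [r for r in records if term in r.title.lower()]


def recall_and_precision(retrieved_ids, relevant_ids):
    """Compute recall and precision from two collections of record ids."""
    retrieved = set(retrieved_ids)
    relevant = set(relevant_ids)
    relevant_retrieved = retrieved & relevant
    # Recall: relevant items retrieved / all relevant items in the file.
    recall = len(relevant_retrieved) / len(relevant) if relevant else 0.0
    # Precision: relevant items retrieved / all items retrieved.
    precision = len(relevant_retrieved) / len(retrieved) if retrieved else 0.0
    return recall, precision


hits = search(FILE, "synagogue")
hit_ids = [r.record_id for r in hits]

# Assumption: the user wants congregation history and judges records 1 and 4 relevant.
relevant_ids = [1, 4]

# A false drop matched the search logic but is not relevant to the user.
false_drops = [i for i in hit_ids if i not in relevant_ids]

recall, precision = recall_and_precision(hit_ids, relevant_ids)
print("Hits:", hit_ids)
print("False drops:", false_drops)
print("Recall:", recall, "Precision:", precision)
```

In this sketch the search for "synagogue" produces two hits. One of them is a false drop, and one relevant record is missed because its title uses the word "congregation" instead. Broadening the search to pick up that record would raise recall but would probably bring in more false drops, which is the trade-off described under Precision.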

Types of databases

         Reference or citation – Points users to the source of information.   Examples are bibliographic databases such as library catalogs and indexes.
          
         Source or Full text – Contains the actual data or texts that the user wants. Examples are encyclopedias, EBSCO academic databases, ProQuest, and LexisNexis.
          
         Directories: Provide access to names, addresses, and related data. Examples are phone books and the American Library Directory.

         Pictorial:  Provide access to graphics and other still visuals.  Examples are ArtStor and map databases.

         Visual and/or audio: Provide access to recorded moving images and recorded audio. Examples are Alexander Street Videos and PBS videos.

         Hybrid databases: Databases that can’t be classified in one of the categories above or that contain both full text and citations. Examples are Internet Archive and WestLaw.
          
Database players

      In commercial online searching there are usually four players:

         1. Database producer: creates the database itself.

         2. Database vendor: processes and distributes the database to libraries, businesses, and organizations.

         3. Trained expert: helps the end user learn about searching or helps guide the search.

         4. End user: the person who will actually use the search results.
       
In some cases, the producer and vendor will be the same entity.
