What Is Information Retrieval: A Brief History of Information Retrieval

When we think of information retrieval, the first thing that often comes to mind is a search engine. That is a fair association: search engines are the current pinnacle of information retrieval, and they have been in development for over 27 years.

The field itself is older, though: information retrieval started as an idea in the 1930s and only later came to life. Here is a brief history of information retrieval.

The Inception of Information Retrieval

Information retrieval, or rather machines able to fetch information, was first heard of in 1948, when Holmstrom described one such machine, called the Univac. It could record specific symbols on a magnetic steel tape, fetch a document filed under those symbols, and then retype its content.

Automated systems followed in the 1950s, and by 1957 one had even appeared in a movie, Desk Set. By the 70s, information retrieval systems could already perform well on collections of several thousand documents. Progress accelerated from 1992, however, with the Text Retrieval Conference (TREC), sponsored by the US Department of Defense and the National Institute of Standards and Technology.

Information Retrieval Model Types

Information retrieval models can be categorized along two primary dimensions: the first is the mathematical basis of the model, and the second is the properties of the model.

Along the first dimension, four types of model are commonly used.

The set-theoretic model represents documents as sets of words or phrases, and it finds matches through set operations on those sets.
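
As a minimal sketch of the set-theoretic idea, the Python snippet below treats each document as a set of words and answers a Boolean AND query with a subset test. The sample documents and the helper name boolean_and are invented for illustration; real systems usually add an index and further operators.

```python
# Toy corpus (invented for illustration): each document becomes a set of words.
docs = {
    "d1": "information retrieval with search engines",
    "d2": "history of magnetic tape storage",
    "d3": "search engines rank retrieval results",
}
doc_sets = {doc_id: set(text.lower().split()) for doc_id, text in docs.items()}


def boolean_and(query):
    """Return the ids of documents that contain every word of the query."""
    terms = set(query.lower().split())
    # A document matches when the query terms are a subset of its word set.
    return sorted(doc_id for doc_id, words in doc_sets.items() if terms <= words)


print(boolean_and("search retrieval"))  # ['d1', 'd3']
```

Note that a match here is all or nothing: a document either contains every query word or it is left out, and the matches are not ranked.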

The algebraic model represents documents and queries as vectors, matrices or tuples, and it expresses how similar a document is to a query as a single number.
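
One common instance of the algebraic approach is the vector space model with cosine similarity. The sketch below assumes plain word counts as vector components (real systems often weight terms, for example with TF-IDF), and the function names to_vector and cosine are illustrative.

```python
import math
from collections import Counter


def to_vector(text):
    """Represent a text as a sparse term-count vector keyed by word."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


query = to_vector("search engines")
doc_a = to_vector("search engines rank documents")
doc_b = to_vector("magnetic steel tape")

print(cosine(query, doc_a))  # positive: two terms in common with the query
print(cosine(query, doc_b))  # 0.0: no terms in common
```

Unlike a Boolean match, the score is graded, so documents can be sorted from most to least similar.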

The probabilistic model treats retrieval as a matter of probability: it estimates how likely a text is to be relevant to a search and ranks texts by that estimate.
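
There are several probabilistic approaches. The sketch below uses one simple variant, query-likelihood scoring with add-one smoothing, chosen here only because it is short; it is an assumption of this example, not the specific method the text refers to. The function name and sample documents are illustrative.

```python
import math


def query_log_likelihood(query, doc, vocab_size):
    """Log-probability that the document's word distribution generates the query.

    Add-one (Laplace) smoothing keeps unseen words from zeroing out the score.
    """
    doc_words = doc.lower().split()
    score = 0.0
    for term in query.lower().split():
        p = (doc_words.count(term) + 1) / (len(doc_words) + vocab_size)
        score += math.log(p)
    return score


docs = [
    "search engines retrieve documents",
    "magnetic tape stores symbols",
]
# Vocabulary size: number of distinct words across the toy corpus.
vocab_size = len({w for d in docs for w in d.lower().split()})

query = "search documents"
ranked = sorted(
    docs,
    key=lambda d: query_log_likelihood(query, d, vocab_size),
    reverse=True,
)
print(ranked[0])  # the document estimated most likely to match the query
```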

The feature-based retrieval models describe each document as a vector of feature values and look for the best way of combining these features into a single ranked result. This type is by far the most advanced, as it can use all the other models simply by treating each of them as one more feature.
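
To make the combination step concrete, here is a minimal sketch that scores a document as a weighted sum of two feature functions. The features, the weights and all names are hypothetical; in practice the weights are typically learned from data rather than set by hand.

```python
def overlap_feature(query, doc):
    """Fraction of query words that appear in the document (a set-style signal)."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def brevity_feature(query, doc):
    """Illustrative second signal: shorter documents score closer to 1."""
    return 1.0 / (1 + len(doc.split()))


# Hypothetical hand-set weights; a real system would usually learn these.
FEATURES = [(overlap_feature, 0.8), (brevity_feature, 0.2)]


def score(query, doc):
    """Combine all feature values into a single ranking score."""
    return sum(weight * feature(query, doc) for feature, weight in FEATURES)


docs = [
    "a long text about magnetic steel tape machines",
    "search engines rank documents",
]
ranked = sorted(docs, key=lambda d: score("search engines", d), reverse=True)
print(ranked)  # documents ordered from highest to lowest combined score
```

Adding another model to the ranking only means appending another function and weight to FEATURES, which is what makes this family so flexible.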

The second dimension deals with the properties of the model: it can work without term-interdependencies or with them, and in the latter case the interdependencies are either transcendent or immanent.

Models without term-interdependencies treat words as separate entities, in other words, as independent of one another.

Models with transcendent term-interdependencies allow interdependencies between terms, but they do not define them. The relationships have to come from an outside source, such as your own input; think of the advanced options in a search engine.

Models with immanent term-interdependencies allow interdependencies and also define them within the model itself, for example from how often terms occur together in the documents.
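
The difference between the last two categories can be sketched in a few lines. In this example, which rests on an invented corpus and a hypothetical hand-written relation table, the transcendent case takes term relations from outside, while the immanent case derives them from the documents by counting which words share a document.

```python
from collections import Counter
from itertools import combinations

corpus = [
    "car engine repair",
    "automobile engine repair",
    "car insurance quote",
]

# Transcendent: the model accepts term relations but relies on an outside
# source (here a hand-written, hypothetical table) to say what they are.
external_relations = {("automobile", "car"): 1.0}

# Immanent: the relations are derived from the data the model already holds,
# here by counting how often two words appear in the same document.
cooccurrence = Counter()
for text in corpus:
    for pair in combinations(sorted(set(text.split())), 2):
        cooccurrence[pair] += 1

print(external_relations[("automobile", "car")])  # supplied from outside
print(cooccurrence[("engine", "repair")])         # counted from the corpus
print(cooccurrence[("automobile", "car")])        # never seen together here
```

In this toy corpus the words car and automobile never share a document, so simple co-occurrence counting does not link them, while the external table does. More elaborate immanent models try to recover such links indirectly.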

Some search engines use predictive algorithms to anticipate what you are looking for. The predictions can be based on different types of data: the most popular searches in the world, the most popular in your region, or your own previous searches. Google’s search engine is great in this regard. If you search for a movie and then start typing the name of its main character, the search box should autofill your query with suggestions associated with that movie.
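
A toy version of such query prediction might look like the sketch below. It is not how any particular search engine works: the query logs, the boost value and the function name suggest are all hypothetical, and the ranking rule (global popularity plus a bonus for the user's own past searches) is just one simple assumption.

```python
from collections import Counter

# Hypothetical query logs: counts of past searches overall and for one user.
global_counts = Counter({"desk set cast": 50, "desk set 1957": 30, "desk lamp": 80})
user_history = Counter({"desk set 1957": 3})


def suggest(prefix, limit=2, personal_boost=100):
    """Return up to `limit` completions for `prefix`, best candidate first."""
    candidates = [q for q in global_counts if q.startswith(prefix)]
    # Rank by global popularity plus a boost for the user's own past searches.
    candidates.sort(
        key=lambda q: global_counts[q] + personal_boost * user_history[q],
        reverse=True,
    )
    return candidates[:limit]


print(suggest("desk"))  # ['desk set 1957', 'desk lamp']
```

Here the user's earlier searches for the movie push its query above the globally more popular "desk lamp", which mirrors the movie example above.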

Information retrieval still has room to grow, though, and web search engines can only get better.