
Information Retrieval

In this section, I will explore some fundamental topics in Information Retrieval (IR) and related subjects.

I will use the following notation throughout this section:
D = \{d_1,\ldots,d_m\} is a collection of (text) documents;
V = \{t_1,\ldots,t_n\} is the dictionary of unique terms as extracted from D (a.k.a. vocabulary or lexicon);
q is a query used to ask the retrieval system for those documents in D that are relevant to q. (Note that here we are not making any assumption about what it really means for a document to be relevant to a query);
D_q\subseteq D is the collection of results (i.e., documents) returned by the IR system, which it assumes to be relevant to the query q. This may be either an unranked set (i.e., there is no order relation between any pair of retrieved documents) or a ranked list (i.e., retrieved documents are scored and sorted according to their relevance to the query).
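
To make the notation concrete, here is a minimal Python sketch of D, V, q and D_q on a toy collection. Everything in it is my own assumption for illustration: the three documents, the `tokenize` helper, and above all the notion of relevance (simple term overlap between query and document), which the notation above deliberately leaves open.

```python
import re

# Toy collection D (hypothetical documents, for illustration only).
D = {
    "d1": "the cat sat on the mat",
    "d2": "the dog chased the cat",
    "d3": "stock markets fell sharply",
}


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Vocabulary V: the unique terms extracted from D.
V = sorted({term for text in D.values() for term in tokenize(text)})


def retrieve_unranked(q, docs):
    """Return D_q as an unranked set of document ids.

    Assumption: a document counts as relevant if it shares
    at least one term with the query.
    """
    q_terms = set(tokenize(q))
    return {doc_id for doc_id, text in docs.items()
            if q_terms & set(tokenize(text))}


def retrieve_ranked(q, docs):
    """Return D_q as a ranked list of (document id, score) pairs.

    Assumption: the score is the number of distinct query terms
    that occur in the document; ties are broken by document id.
    """
    q_terms = set(tokenize(q))
    scored = []
    for doc_id, text in docs.items():
        score = len(q_terms & set(tokenize(text)))
        if score > 0:
            scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


unranked = retrieve_unranked("cat dog", D)
ranked = retrieve_ranked("cat dog", D)
```

With the query "cat dog", the unranked result contains d1 and d2 with no order between them, whereas the ranked result puts d2 first because it matches both query terms. Any other definition of relevance could be swapped in without changing the notation.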
