+As stated before, the default user interface of the platform provides a search box in the page header, along with an options list for narrowing results to a specific category.
+
+The search functionality relies on MySQL's full-text search functions \cite{mysql-fulltext}, so its query language closely follows MySQL's. Two MySQL full-text modes are used: natural language full-text search and boolean full-text search.
+
+\textit{Natural language full-text search} allows users to enter space-separated keywords that are matched against table rows. Searching is performed only on columns that have a full-text index, so we index the \texttt{title}, \texttt{description} and \texttt{tags} columns of the \texttt{videos} table. What MySQL calls natural language search is in fact vector space search, which computes the relevance of each result row to the query using a variation of the \textit{tf-idf} formula \cite{tf-idf}, one of the most widely used weighting schemes in information retrieval. This kind of search performs well on our video set of about 120 items. However, natural language mode has some limitations. For instance, any word shorter than four characters is not indexed, so a query like \textit{``let it be''} or \textit{``git c repo''} would not retrieve anything. Also, wildcards, quoted phrases and boolean operators such as \textit{and}, \textit{or} and \textit{not} are not supported.
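+
+As a point of reference, the textbook form of the \textit{tf-idf} weight is sketched below. MySQL's implementation may differ in its normalisation, and the symbols $w_{t,d}$, $\mathrm{tf}_{t,d}$, $\mathrm{df}_t$ and $N$ are introduced here for illustration only:
+\[
+  w_{t,d} = \mathrm{tf}_{t,d} \cdot \log\frac{N}{\mathrm{df}_t}
+\]
+Here $\mathrm{tf}_{t,d}$ is the number of occurrences of term $t$ in row $d$, $\mathrm{df}_t$ is the number of rows that contain $t$, and $N$ is the total number of rows. A term that appears in almost every row therefore contributes little to the relevance, while a rare term contributes more.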
+
+To work around these limitations, P2P-Tube's search feature also uses MySQL's \textit{boolean full-text search} functions. This mode allows users to prefix keywords in the search query with special characters. For example, the $+$ and $-$ operators indicate that a word must be present or absent, respectively, in the results. A full reference of the supported operators can be found in the MySQL documentation \cite{mysql-boolean-search}. Boolean search mode is used only when at least one of these operators is present in the search query; otherwise, natural language mode is used. Boolean mode offers the user somewhat more flexibility, but it has a major disadvantage: results are not ordered by relevance. Instead of computing a graded relevance score for each row, this mode contributes one for each column that matches and zero otherwise. We therefore weight the columns differently to reflect their importance. The \texttt{title} column, the most important column of the \texttt{videos} table with respect to searching, receives 50\% of the total weight, \texttt{tags} receives 30\% and \texttt{description} receives 20\%. So if a query matches the title, 0.5 rather than 1 is added to the relevance; a match on the tags adds 0.3 and a match on the description adds 0.2.
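+
+The mode selection and the column weighting described above can be sketched as follows. This is a minimal illustration and not the P2P-Tube implementation: the function names are hypothetical, and only the $+$ and $-$ operators are checked, whereas MySQL supports more.
+
```python
# Hypothetical sketch: names and the operator set are assumptions,
# not taken from the P2P-Tube code base.

# Column weights for boolean mode, as given in the text.
BOOLEAN_WEIGHTS = {"title": 0.5, "tags": 0.3, "description": 0.2}

# Assumed subset of MySQL boolean operators that may prefix a keyword.
LEADING_OPERATORS = ("+", "-")


def pick_search_mode(query: str) -> str:
    """Return "boolean" if any keyword starts with an operator, else "natural"."""
    for keyword in query.split():
        if keyword.startswith(LEADING_OPERATORS):
            return "boolean"
    return "natural"


def boolean_relevance(matched_columns) -> float:
    """Sum the weights of the columns that matched the query."""
    return sum(BOOLEAN_WEIGHTS[column] for column in matched_columns)
```
+
+Under these assumptions, a query such as \textit{``+beatles -live''} selects boolean mode, and a row matching in both \texttt{title} and \texttt{tags} receives a relevance of $0.5 + 0.3 = 0.8$.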
+
+Like natural language mode, boolean mode skips words with fewer than four characters, so our implementation also searches for word fragments in the indexed columns. To support this, we use the SQL \texttt{LIKE} operator with each word wrapped in the \texttt{\%} wildcard at the beginning and at the end. Matched fragments contribute little to the relevance, because the corresponding weights of the \texttt{title}, \texttt{tags} and \texttt{description} columns are small: 25\%, 15\% and 10\%, respectively. So the query string \textit{``let it be''} will match the Beatles video of the same name, but it will also match videos containing \textit{``letter''}, because \textit{``let''} from the query is a fragment of \textit{``letter''}.
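+
+The fragment matching can be illustrated with the sketch below. It is written under several assumptions that the text does not settle: every query word is used as a fragment, a column contributes its weight at most once, and matching is case-insensitive. The function names are hypothetical, and the \texttt{LIKE} matching is emulated in Python rather than executed by MySQL.
+
```python
# Hypothetical sketch: function names and matching details are assumptions.

# Column weights for fragment (LIKE) matches, as given in the text.
FRAGMENT_WEIGHTS = {"title": 0.25, "tags": 0.15, "description": 0.10}


def like_patterns(query: str) -> list:
    """Turn each query word into a LIKE pattern of the form %word%."""
    patterns = []
    for word in query.split():
        # Escape the LIKE wildcards (and the escape character itself)
        # so that they are matched literally.
        escaped = (
            word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
        )
        patterns.append("%" + escaped + "%")
    return patterns


def fragment_relevance(row: dict, query: str) -> float:
    """Emulate LIKE matching: each column that contains at least one
    query word as a substring contributes its weight once."""
    words = [word.lower() for word in query.split()]
    score = 0.0
    for column, weight in FRAGMENT_WEIGHTS.items():
        text = row.get(column, "").lower()
        if any(word in text for word in words):
            score += weight
    return score
```
+
+With these assumptions, the query \textit{``let it be''} yields the patterns \texttt{\%let\%}, \texttt{\%it\%} and \texttt{\%be\%}, and a video whose title contains \textit{``letter''} but whose tags and description match nothing receives a fragment relevance of 0.25.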
+
+For our small video set, the search feature works well and shows no performance issues. As future work, we plan to adopt more advanced search tools such as Lucene or Solr from the Apache Software Foundation.
+