+Besides creating torrents and seeding, a content ingestion job also needs to download raw video files from the web server, transcode them to the destination formats, extract image thumbnails, and transfer the resulting torrent files and thumbnails back to the web server. Each of these features is exposed through a pluggable interface. In the \texttt{api} package, the file \texttt{base.py} contains these interfaces, detailed below:
+
+\begin{itemize}
+ \item \textbf{\texttt{BaseFileTransferer}} abstracts file transfers between the CIS and the web server.
+ \item \textbf{\texttt{BaseTranscoder}} abstracts video asset transcoding, so raw video files can be transcoded to destination formats by using an implementation of this base class.
+ \item \textbf{\texttt{BaseThumbExtractor}} abstracts the extraction of image thumbnails from video assets. Multiple extraction policies can be provided: a thumbnail can be extracted from a specific position in the video, given in seconds, or from a random position; alternatively, summary thumbnails can be extracted, where a series of thumbnails is captured by taking several snapshots of the video.
+ \item \textbf{\texttt{BaseAVInfo}} abstracts a tool for retrieving information about a video asset. Currently this interface can be used to get the duration of a video.
+\end{itemize}
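As an illustration of how such pluggable interfaces could look, the following minimal sketch declares the four base classes as Python abstract classes. Only the class names come from the text above; every method name and signature, and the choice of evenly spaced snapshots for the summary policy, are assumptions made for this example and may differ from the actual \texttt{base.py}.

```python
import random
from abc import ABC, abstractmethod


class BaseFileTransferer(ABC):
    """Abstracts file transfers between CIS and the web server."""

    @abstractmethod
    def get(self, file_names):
        """Download the named files from the web server."""

    @abstractmethod
    def put(self, file_names):
        """Upload the named files to the web server."""


class BaseTranscoder(ABC):
    """Abstracts transcoding of a raw video to a destination format."""

    @abstractmethod
    def transcode(self, input_path, output_path, video_format):
        """Transcode input_path into output_path using video_format."""


class BaseAVInfo(ABC):
    """Abstracts a tool that retrieves information about a video asset."""

    @abstractmethod
    def get_duration(self, input_path):
        """Return the duration of the video in seconds."""


class BaseThumbExtractor(ABC):
    """Abstracts thumbnail extraction; the policies build on extract_at()."""

    def __init__(self, av_info):
        self.av_info = av_info

    @abstractmethod
    def extract_at(self, input_path, output_path, seconds):
        """Extract one thumbnail at the given position in seconds."""

    def extract_random(self, input_path, output_path):
        """Extract one thumbnail from a random position."""
        duration = self.av_info.get_duration(input_path)
        return self.extract_at(input_path, output_path,
                               random.uniform(0, duration))

    def extract_summary(self, input_path, output_pattern, count):
        """Extract `count` evenly spaced thumbnails (assumed summary policy)."""
        duration = self.av_info.get_duration(input_path)
        step = duration / (count + 1)
        return [self.extract_at(input_path, output_pattern.format(i), step * i)
                for i in range(1, count + 1)]
```

With this layout a concrete extractor only has to implement the single-position extraction, and the random and summary policies are inherited from the base class.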
+
+Developers can extend these interface classes with their own implementations. In the CIS configuration file (\texttt{config.py}), system administrators can choose between several alternative implementations of a feature. Currently, file transfers are done over FTP (File Transfer Protocol) \cite{ftp} by the class \texttt{FTPFileTransferer}, which extends \texttt{BaseFileTransferer} and relies on the standard Python FTP libraries. FTP uses two ports for communication between client and server, one for the control channel and the other for the data channel. The control channel can carry sensitive information, so TLS (Transport Layer Security) \cite{tls} is used to encrypt it. Developers are encouraged to implement alternative protocols for file transfers. For example, we are planning an rsync \cite{rsync} implementation, which recovers better from failures and checks data integrity. Although FTP is less advanced than rsync, we decided to implement it first because of its popularity.
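A minimal sketch of an FTP transfer class built on the standard \texttt{ftplib} module is given below. It is not the actual \texttt{FTPFileTransferer}: the constructor arguments, the method names and the injectable connection factory are assumptions made for illustration. \texttt{FTP\_TLS.login} secures the control channel before sending the credentials, which matches the behaviour described above.

```python
import os
from ftplib import FTP_TLS


class FTPFileTransferer:
    """Sketch only; in CIS this class extends BaseFileTransferer."""

    def __init__(self, host, user, password, local_dir, ftp_factory=FTP_TLS):
        self.host = host
        self.user = user
        self.password = password
        self.local_dir = local_dir
        # ftp_factory is a hypothetical hook that lets tests inject a fake.
        self.ftp_factory = ftp_factory

    def _connect(self):
        ftp = self.ftp_factory(self.host)
        # With FTP_TLS, login() negotiates TLS on the control channel first.
        # Calling ftp.prot_p() afterwards would also encrypt the data channel.
        ftp.login(self.user, self.password)
        return ftp

    def get(self, file_names):
        """Download files from the web server into local_dir."""
        ftp = self._connect()
        try:
            for name in file_names:
                with open(os.path.join(self.local_dir, name), "wb") as f:
                    ftp.retrbinary("RETR " + name, f.write)
        finally:
            ftp.quit()

    def put(self, file_names):
        """Upload files from local_dir to the web server."""
        ftp = self._connect()
        try:
            for name in file_names:
                with open(os.path.join(self.local_dir, name), "rb") as f:
                    ftp.storbinary("STOR " + name, f)
        finally:
            ftp.quit()
```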
+
+Our \texttt{BaseTranscoder} implementation, \texttt{FFmpegTranscoder}, uses \textit{FFmpeg} \cite{ffmpeg}, a software solution for recording, converting and streaming audio and video. Transcoding is possible between almost any pair of formats by using third-party libraries. Because FFmpeg is written in C, CIS runs it by forking a new process and creates a pipe for monitoring its standard output and standard error. Porting P2P-Tube to Windows should be easy because FFmpeg also works on this operating system.
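The process-and-pipe approach can be sketched with the standard \texttt{subprocess} module as follows. The function names and the particular FFmpeg options chosen here are assumptions for illustration, not the options used by \texttt{FFmpegTranscoder}. Standard error is merged into standard output so that a single pipe carries both streams.

```python
import subprocess


def build_transcode_command(input_path, output_path, video_codec, audio_codec):
    """Build an FFmpeg command line (illustrative subset of options)."""
    return ["ffmpeg", "-y", "-i", input_path,
            "-c:v", video_codec, "-c:a", audio_codec, output_path]


def run_and_monitor(command, on_line=None):
    """Run a command in a child process and monitor its output via a pipe.

    Returns the exit code and the list of captured output lines.
    """
    lines = []
    with subprocess.Popen(command,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT,  # one pipe for both streams
                          text=True) as process:
        for line in process.stdout:
            line = line.rstrip("\n")
            lines.append(line)
            if on_line is not None:
                on_line(line)  # e.g. log progress or detect errors
    return process.returncode, lines
```

A transcoding call would then look like \texttt{run\_and\_monitor(build\_transcode\_command("raw.avi", "out.ogv", "libtheora", "libvorbis"))}, where a non-zero exit code signals a failed job.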
+
+FFmpeg can also extract thumbnails, so we used the same software to implement \texttt{FFmpegThumbExtractor}, which extends \texttt{BaseThumbExtractor}. Information about a video asset can be obtained with \texttt{FFprobeAVInfo}, an implementation of the \texttt{BaseAVInfo} base class that uses ffprobe, a utility from the FFmpeg software package.
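For illustration, the sketch below shows one way to extract a single frame with \texttt{ffmpeg} and to read the duration with \texttt{ffprobe}. The helper names are hypothetical, and the command lines are only one possible choice of options; the actual \texttt{FFmpegThumbExtractor} and \texttt{FFprobeAVInfo} may invoke the tools differently.

```python
import subprocess


def build_thumbnail_command(input_path, output_path, seconds):
    """Seek to `seconds` and write exactly one video frame as an image."""
    return ["ffmpeg", "-y", "-ss", str(seconds), "-i", input_path,
            "-frames:v", "1", output_path]


def build_duration_command(input_path):
    """Ask ffprobe to print only the container duration, in seconds."""
    return ["ffprobe", "-v", "error",
            "-show_entries", "format=duration",
            "-of", "default=noprint_wrappers=1:nokey=1",
            input_path]


def parse_duration(ffprobe_output):
    """Convert ffprobe's textual output into a float number of seconds."""
    return float(ffprobe_output.strip())


def get_duration(input_path):
    """Run ffprobe and return the duration of the video in seconds."""
    result = subprocess.run(build_duration_command(input_path),
                            capture_output=True, text=True, check=True)
    return parse_duration(result.stdout)
```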
+
+\textbf{CIS-LB} (CIS -- Load Balancer) usually resides on the same machine as a web server and also communicates with it via web services. We therefore implemented CIS-LB in Python with Web.py, as we did for CIS. Messages received from a web server are forwarded to a CIS from the pool according to a load balancing policy. Policies are implemented as pluggable interfaces, in the same way as the features of CIS. Three policies are available:
+
+\begin{itemize}
+ \item \textbf{Random}: choose a random CIS from the pool and forward the request from the web server to it.
+ \item \textbf{Optimum}: send a get load request to each CIS, determine which one is the least loaded and forward the request from the web server to it.
+ \item \textbf{Randomized Suboptimal}: choose $k$ random CIS machines, send a get load request to each one, calculate which one is the least loaded and forward the request from the web server to it.
+\end{itemize}
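The three policies listed above can be sketched as plain Python functions. This is an illustration under assumptions: the function names, the \texttt{get\_load} callback (standing in for the get load request) and the optional \texttt{rng} parameter are hypothetical and do not reflect the actual CIS-LB classes.

```python
import random


def random_policy(pool, get_load, rng=random):
    """Pick a CIS uniformly at random; no load requests are sent."""
    return rng.choice(pool)


def optimum_policy(pool, get_load, rng=random):
    """Query every CIS and pick the least loaded one."""
    return min(pool, key=get_load)


def make_randomized_suboptimal_policy(k):
    """Build a policy that queries only k randomly chosen CIS machines."""
    def policy(pool, get_load, rng=random):
        candidates = rng.sample(pool, min(k, len(pool)))
        return min(candidates, key=get_load)
    return policy
```

The number of get load requests per ingestion request is 0 for the random policy, $k$ for the randomized suboptimal policy and the pool size for the optimum policy, which is the source of the trade-off between speed and quality of the choice.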
+
+The \textit{optimum policy} is the slowest because it must wait until all CIS machines respond to the get load request; although it finds the best solution, it does not scale to systems with a large number of CIS machines. This policy is recommended for small-scale systems. The \textit{random policy} is the fastest. It does not find the best solution, but it selects each CIS machine with equal probability, which provides good load balancing on average. It is recommended for large systems. The \textit{randomized suboptimal policy} is a compromise between the optimum and random policies and is recommended for medium-sized systems. The choice among the three policies depends not only on the number of CIS machines in the system, but also on the size of the raw video files that need to be processed.
+
+\subsection{Front End and Back End Communication}
+\label{subsec:communication}
+
+The web server needs to communicate with the Content Ingestion Server, either directly or through a CIS Load Balancer (CIS-LB), by sending a content ingestion request. Using a web service for this communication reduces the implementation overhead: sending HTTP requests from the PHP web server application is much easier than creating a new custom protocol. The same applies at the other end of the communication, the Python server, where requests are received by the Web.py framework, which makes processing HTTP methods very easy.
+
+If another video platform decides to use our solution based on CIS machines, the web service interfaces simplify interoperability, no matter what programming language is used for the web server application.
+
+\subsection{The Choice for the Web Service Type}
+\label{subsec:web-service}
+
+Our choice between a SOAP web service and a simple RESTful web service was based on our needs. We wanted a server with low communication overhead, and SOAP has the disadvantage of consuming more computational resources when processing requests. Moreover, the messages that need to be passed between our services have a simple structure.
+
+The \textit{get load request} does not have any parameters, so a simple HTTP GET request is sufficient and any extra data transmitted as XML with SOAP is redundant.
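To show how little is needed for a parameterless request, the sketch below serves and issues a get load request using only the Python standard library. It deliberately uses \texttt{http.server} instead of Web.py so that it is self-contained. The URL path, the JSON response body and the job queue used as a load metric are all assumptions made for this example.

```python
import http.client
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical load metric: the number of ingestion jobs waiting in a queue.
JOB_QUEUE = []


class LoadHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/get_load":  # assumed resource path
            self.send_error(404)
            return
        body = json.dumps({"load": len(JOB_QUEUE)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def serve(port):
    """Serve get load requests forever on the given port."""
    HTTPServer(("", port), LoadHandler).serve_forever()


def fetch_load(host, port, timeout=5):
    """Client side: one plain HTTP GET, no request body needed."""
    connection = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        connection.request("GET", "/get_load")
        response = connection.getresponse()
        if response.status != 200:
            raise RuntimeError("get load failed: %d" % response.status)
        return json.loads(response.read().decode("utf-8"))["load"]
    finally:
        connection.close()
```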
+
+The \textit{content ingestion request} is a more complex message. The name of the uploaded file located on the web server needs to be transmitted, along with information about the video formats, such as containers, codecs, resolutions, frame rates, aspect ratios, audio sampling rates and bit rates. All this information would fit well as parameters in a SOAP message. However, encoding it in JSON seemed to be a much simpler solution. Both PHP and Python offer functions that convert their primitive types and data structures, like lists and dictionaries, into JSON strings. XML messages are more verbose than JSON messages, and features like XML tag attributes are not required for our application.
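A content ingestion request encoded as JSON could look like the sketch below. The field names and the validation step are assumptions made for illustration; only the kinds of information carried (file name, containers, codecs, resolutions, frame rates, aspect ratios, audio sampling rates and bit rates) come from the description above.

```python
import json

# Hypothetical field names for one destination format.
REQUIRED_FORMAT_KEYS = {
    "container", "video_codec", "audio_codec", "resolution", "frame_rate",
    "aspect_ratio", "audio_sampling_rate", "video_bit_rate", "audio_bit_rate",
}


def encode_ingestion_request(file_name, formats):
    """Serialize the request; dictionaries and lists map directly to JSON."""
    return json.dumps({"file_name": file_name, "formats": formats})


def decode_ingestion_request(body):
    """Parse a request body and check that every format is complete."""
    message = json.loads(body)
    for fmt in message["formats"]:
        missing = REQUIRED_FORMAT_KEYS.difference(fmt)
        if missing:
            raise ValueError("missing format fields: "
                             + ", ".join(sorted(missing)))
    return message["file_name"], message["formats"]


example_format = {
    "container": "ogg", "video_codec": "theora", "audio_codec": "vorbis",
    "resolution": "800x600", "frame_rate": 25, "aspect_ratio": "4:3",
    "audio_sampling_rate": 44100,
    "video_bit_rate": "768k", "audio_bit_rate": "192k",
}
example_body = encode_ingestion_request("upload.avi", [example_format])
```

On the PHP side the same structure would be produced with \texttt{json\_encode} from a nested array, so neither end needs a dedicated message parser.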
+
+We expect web servers and their CIS Load Balancers to know the CIS machines in advance, so there is no need for discovery services that provide contact information for new CIS machines. Such services would reduce administrative control and raise security concerns, such as the discovery of malicious CIS machines. Consequently, service discovery features like UDDI from the SOAP ecosystem are not needed.
+
+The functionality of the services does not need to be described because the client application is expected to know it in advance, so SOAP's WSDL feature is not needed either.