\section{Design}
\label{sec:design}

Nowadays, the vast majority of peer-to-peer applications run in user space,
and this is one of the most important sources of delay introduced at the
operating system layer while file transfers occur. For every file operation,
such as \texttt{open}, \texttt{read} and \texttt{write}, as well as for every
socket operation, a system call is made. This induces a considerable overhead
caused by system call interrupts and context switches.

In our design we have tried to eliminate as much system call overhead as
possible by performing as many operations as possible directly in kernel
space. With this in mind, we have outlined the module's architecture as
illustrated in Figure~\ref{fig:arch} and described in Section~\ref{subsec:arch}.

Our model also changes the perspective of the user space programmer. Until
now, in order to obtain our module's behavior, a sender would have to open
the file to be sent, read data from it and then write the data through a
socket. Similarly, a receiver would have to open the destination file for
writing, read data from the network socket and write it into the file. All
these operations have to be repeated until the whole file is sent or
received. Therefore, a considerable overhead can be avoided if all these
operations are implemented directly in the kernel and substituted with a
single system call.

Our approach eliminates almost all the \texttt{open}, \texttt{read} and
\texttt{write} system calls, replacing them with a single
\texttt{write}/\texttt{read} operation. The only information that has to be
passed from user space into kernel space is, for the sender, the full path of
the file that must be transferred and the IP address of the receiver, or
receivers in case there is more than one. On the receiver's side, the system
call can be used to obtain information about the transfer, such as the files
being transferred or the status of the operation.

\begin{figure}[h!]
  \centering
  \includegraphics[scale=0.5]{img/architecture.png}
  \caption{P2PKP Module Architecture}
  \label{fig:arch}
\end{figure}

Our implementation uses UDP datagrams to send and receive information over
the network. Although the UDP protocol is simple and its overhead is quite
small, it is not connection oriented, which can lead to packet loss. However,
this matter is beyond the scope of this research, as we are currently
focusing on developing an efficient solution in terms of latency, without
considering network reliability issues.
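To make the intended interaction more concrete, the sketch below shows how a
sender could drive the module from user space, following the steps described
above: one \texttt{connect} per destination peer and a single \texttt{write}
carrying the absolute path of the file. The address family, protocol number,
port, peer address and file path used here are hypothetical placeholders
chosen only for illustration; the actual constants are defined by our module,
and the subsections below detail the semantics of each call.

\begin{verbatim}
/* Hypothetical sender-side usage sketch; constants are
 * placeholders, the real values are defined by the module. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void)
{
    /* Absolute path of the file to transfer (placeholder). */
    const char *path = "/tmp/data.bin";
    struct sockaddr_in peer;
    int s;

    /* One socket drives the whole transfer. */
    s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0) { perror("socket"); return 1; }

    /* One connect() per destination peer (placeholder address). */
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(5000);
    inet_pton(AF_INET, "192.168.0.2", &peer.sin_addr);
    if (connect(s, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        perror("connect"); return 1;
    }

    /* A single write() hands the absolute path to the module,
     * which performs the whole transfer in kernel space. */
    if (write(s, path, strlen(path)) < 0) {
        perror("write"); return 1;
    }

    close(s);
    return 0;
}
\end{verbatim}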
\subsection{P2PKP Architecture}
\label{subsec:arch}

The architecture of our protocol implementation resides completely in kernel
space, in the shape of a kernel module. As Figure~\ref{fig:arch} illustrates,
the communication between user space and our module is done entirely through
the \texttt{read}, \texttt{write} and \texttt{connect} system calls. However,
our module is not a standalone entity: it interacts with the Linux VFS
(Virtual File System) and NET layers in order to read and write files in
kernel space and to deliver or receive them through the network. Section
\ref{sec:implementation} describes the interactions between all these
components and how they are linked together in order to provide the desired
behavior.

\subsection{P2PKP Sender}
\label{subsec:sender}

The sender part of the module uses the UNIX socket interface in order to
communicate from user space to kernel space. The system calls used in a
sender scenario are \texttt{connect} and \texttt{write}. When a user wants to
transfer a file to one or more peers, the module needs only the local file
name and the set of peer addresses. The \texttt{connect} system call is used
to specify the destinations of a single file; multiple destinations can be
specified by issuing multiple \texttt{connect} operations. After all the
destinations have been specified from user space, a \texttt{write} operation
must be used in order to send the file over the network to all specified
peers. The name of the file represents the absolute path of the file that
will be transferred to the peers. After these operations are passed from user
space to kernel space, the whole file transfer occurs in kernel space, hence
no additional system calls must be made.

For each \texttt{connect} system call, the kernel module creates a new kernel
socket for the specified destination. Upon the \texttt{write} system call,
the kernel module tries to open the file. If the operation succeeds, the
module starts reading data from the file and writing it to the sockets
created by the \texttt{connect} operations. After the whole content of the
file has been transferred through each socket, the file descriptor and the
sockets are closed.

Every socket created in user space is designed to transfer one or more files
to one or multiple destinations. The total number of system calls used to
transfer the files to all peers is equal to the number of files transferred
plus the total number of recipients.

\subsection{P2PKP Receiver}
\label{subsec:receiver}

The receiver part of the module is also implemented in kernel space. The
communication between user space and the module is likewise realized through
the socket interface. The \texttt{bind} system call is used to set the
interface and the port that the module will listen on for incoming requests.
The \texttt{read} operation is used to specify the file where the incoming
data will be stored. The file has to be specified using its absolute path,
rather than a relative name. Currently, our implementation stores all the
incoming data in a single file. We have not advanced to more complex
scenarios, as our initial goal was only to see whether this design can
compete with other approaches, such as \texttt{sendfile}.

After the \texttt{bind} operation, the module starts listening for data. As
pointed out in the previous paragraph, all the data is transferred into a
single file, specified by the \texttt{read} system call. Therefore, the total
number of system calls needed to receive a single file over the network is
reduced to two.
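For completeness, a matching receiver-side sketch is shown below:
\texttt{bind} selects the listening interface and port, and \texttt{read}
names the absolute path of the destination file. As in the sender sketch, the
address family, protocol number, port and path are hypothetical placeholders,
and the interpretation of the \texttt{read} buffer as a destination path is
behavior provided by our module rather than by a standard socket.

\begin{verbatim}
/* Hypothetical receiver-side usage sketch; constants are
 * placeholders, the real values are defined by the module. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void)
{
    /* Absolute path of the destination file (placeholder). */
    char path[] = "/tmp/incoming.bin";
    struct sockaddr_in local;
    int s;

    s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0) { perror("socket"); return 1; }

    /* bind() selects the interface and port to listen on. */
    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_port = htons(5000);
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(s, (struct sockaddr *)&local, sizeof(local)) < 0) {
        perror("bind"); return 1;
    }

    /* Per the design above, read() passes the destination path;
     * the module then stores all incoming data in that file. */
    if (read(s, path, strlen(path)) < 0) {
        perror("read"); return 1;
    }

    close(s);
    return 0;
}
\end{verbatim}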