From: Victor Grishchenko Date: Tue, 15 Dec 2009 13:53:26 +0000 (+0100) Subject: libswift.org X-Git-Url: http://p2p-next.cs.pub.ro/gitweb/?a=commitdiff_plain;h=34c0fe8ed593ce0f5b3de71d1a4130a382277ccd;p=swift-upb.git libswift.org --- diff --git a/doc/apusapus.png b/doc/apusapus.png new file mode 100644 index 0000000..77a6a26 Binary files /dev/null and b/doc/apusapus.png differ diff --git a/doc/cloud4.jpg b/doc/cloud4.jpg new file mode 100644 index 0000000..1164ec5 Binary files /dev/null and b/doc/cloud4.jpg differ diff --git a/doc/index.html b/doc/index.html new file mode 100644 index 0000000..e9bcf62 --- /dev/null +++ b/doc/index.html @@ -0,0 +1,272 @@ + + + + + + swift: the multiparty transport protocol + + + + + + +
+ +

swift: the multiparty transport protocol

+ +
Turn the Net into a single data Cloud!
+ +

Ideas

+

As wise people say, the Internet was initially built for remotely connecting scientists to expensive supercomputers (whose computing power was comparable to that of modern cell phones). Thus, it supported the abstraction of conversation. Currently, the Internet is mostly used for disseminating content to the masses, and this mismatch definitely creates some problems.

+ +

The swift protocol is a content-centric multiparty transport protocol. Basically, it serves one and only one request: "Here is a hash! Give me data for it!" Oversimplifying a bit, it can be understood as BitTorrent at the transport layer. Ultimately, it aims at the abstraction of the Internet as a single big data Cloud. Such entities as storage, servers and connections are abstracted away and are virtually invisible at the API layer. Data is received from whatever source is available, and its integrity is checked cryptographically with Merkle hash trees.

+ +

An old Unix adage says: "free memory is wasted memory". Once a computer is powered on, there is no benefit in keeping some memory unoccupied. We may extend this principle a bit further:

  • free bandwidth is wasted bandwidth;
  • free storage is wasted storage.

Unless your power budget is really tight, there is no sense in conserving either. Thus, instead of putting emphasis on reciprocity and incentives, we focus on lighter-footprint code, non-intrusive congestion control and automatic disk space management.

+ +

Currently, most parts of the protocol/library are implemented, pass basic testing and successfully transfer data on real networks. After more thorough testing, the protocol and the library are expected to be ready for real-life use in March 2010.

+
+ +

Design of the protocol

+ +

Most features of the protocol follow from its function as a content-centric multiparty transport protocol. It entirely drops TCP's abstraction of sequential, reliable data stream delivery, which is redundant in our case. For example, out-of-order data can still be saved, and the same piece of data can always be received from another peer. Being implemented over UDP, the protocol does its best to make every datagram self-contained. In general, cutting unneeded functions and aggressively collapsing layers greatly simplified the protocol compared to, e.g., the BitTorrent+TCP stack.

+ +

Atomic datagrams, not data stream

+

To achieve per-datagram flexibility of data flow and also to adapt to the unreliable medium (UDP and, ultimately, IP), the protocol was built around the abstraction of atomic datagrams. Ideally, once received, a datagram is either immediately discarded or permanently accepted, ready to be forwarded to other peers. For the sake of flexibility, most of the protocol's messages are optional. It also has no "standard" header. Instead, each datagram is a concatenation of zero or more messages. No message ever spans two datagrams. Except for the data pieces themselves, no message is acknowledged and thus none is guaranteed to be delivered.
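The "concatenation of zero or more messages" idea can be illustrated with a minimal sketch. The message types, the one-byte type tag and the fixed 4-byte payload below are hypothetical choices made only for this example; they are not the swift wire format. The point is merely that there is no common header and that a datagram is either parsed completely or dropped as a whole.

```python
import struct

# Hypothetical message types, each carrying one 32-bit value.
# These are illustrative assumptions, not the actual swift wire format.
HANDSHAKE, ACK, REQUEST = 0, 1, 2
PAYLOAD_SIZE = {HANDSHAKE: 4, ACK: 4, REQUEST: 4}


def build_datagram(messages):
    """Concatenate (type, value) messages; there is no common header."""
    return b"".join(struct.pack(">BI", mtype, value) for mtype, value in messages)


def parse_datagram(datagram):
    """Return the list of messages, or None if the datagram must be discarded."""
    messages, pos = [], 0
    while pos < len(datagram):
        mtype = datagram[pos]
        size = PAYLOAD_SIZE.get(mtype)
        if size is None or pos + 1 + size > len(datagram):
            # Unknown or truncated message: drop the whole datagram,
            # since no message may continue in another datagram.
            return None
        (value,) = struct.unpack_from(">I", datagram, pos + 1)
        messages.append((mtype, value))
        pos += 1 + size
    return messages
```

An empty datagram parses to an empty list (zero messages), while a datagram with a cut-off last message is rejected entirely rather than partially accepted.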

+
+ +

Scale-independent unit system

+

To avoid a multilayered request/acknowledgement system, in which every next layer does basically the same thing but for bigger chunks of data (as is the case with the BitTorrent+TCP packet-block-piece-file-torrent stacking), swift employs a scale-independent acknowledgement/request system, where data is measured in aligned power-of-2 intervals (so-called bins). All acknowledgements and requests are expressed in terms of bins.
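To make the notion of aligned power-of-2 intervals concrete, here is a small sketch. It assumes a bin is written as a pair (layer, offset) covering packets offset·2^layer up to, but not including, (offset+1)·2^layer; that pair notation and the function names are assumptions for illustration, not the numbering used by the library.

```python
def bin_range(layer, offset):
    """Half-open packet range [start, end) covered by a bin."""
    width = 1 << layer
    return offset * width, (offset + 1) * width


def parent(layer, offset):
    """The bin twice as wide that contains this one."""
    return layer + 1, offset >> 1


def sibling(layer, offset):
    """The other half of the parent bin."""
    return layer, offset ^ 1


def cover(start, end):
    """Split [start, end) greedily into maximal aligned power-of-2 bins."""
    bins = []
    while start < end:
        layer = 0
        # Grow the bin while it stays aligned at `start` and fits before `end`.
        while start % (1 << (layer + 1)) == 0 and start + (1 << (layer + 1)) <= end:
            layer += 1
        bins.append((layer, start >> layer))
        start += 1 << layer
    return bins
```

With this notation, a request or acknowledgement for packets 0 to 7 is the single bin (3, 0), whereas packets 3 to 7 need two bins, (0, 3) and (2, 1). The same vocabulary describes one packet or a whole file, which is what makes the scheme scale-independent.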

+
+ +

Datagram-level integrity checks

+

swift builds Merkle hash trees down to every single packet (1 kilobyte of data). Whenever data is transmitted, all uncle hashes necessary for its verification are prepended to the same datagram. As the receiver remembers the hashes it has already received, the average number of "new" hashes that have to be transmitted is small, normally around 1 per packet of data.
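The "around 1 per packet" figure can be checked with a short simulation. The sketch below makes simplifying assumptions that are not from the original text: the number of packets is a power of two, packets arrive strictly in order from a single sender, the root hash is known in advance, and tree nodes are written as (layer, offset) pairs. It counts, for each packet, how many uncle hashes the receiver does not already hold.

```python
def new_hashes_per_packet(num_packets):
    """Count not-yet-known uncle hashes for each packet under in-order delivery.

    Assumes num_packets is a power of two and the root hash is trusted.
    """
    if num_packets < 1 or num_packets & (num_packets - 1):
        raise ValueError("num_packets must be a power of two")
    top = num_packets.bit_length() - 1      # layer of the root node
    known = {(top, 0)}                      # the root hash is known in advance
    counts = []
    for leaf in range(num_packets):
        layer, offset, fresh = 0, leaf, 0
        known.add((0, leaf))                # hash computed from the packet itself
        while layer < top:
            uncle = (layer, offset ^ 1)     # sibling of the current path node
            if uncle not in known:
                fresh += 1                  # this hash must travel with the packet
                known.add(uncle)
            layer, offset = layer + 1, offset >> 1
            known.add((layer, offset))      # parent hash is now computable
        counts.append(fresh)
    return counts
```

For 8 packets this gives 3, 0, 1, 0, 2, 0, 1, 0 new hashes, 7 in total; in general the total is one less than the number of packets, so the average stays just under one hash per packet. Out-of-order or multi-source delivery changes the per-packet distribution, so treat the numbers as an idealized case.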

+
+ +

NAT traversal by design

+

The only method of peer discovery in swift is PEX (peer exchange): a third peer initiates a connection between two of the peers it is in contact with. The protocol's handshake is engineered to perform simple NAT hole punching transparently whenever that is needed.

+
+ +

Push AND pull

+

The protocol allows for both PUSH (the sender decides what to send) and PULL (the receiver explicitly requests the data). PUSH is normally used as a fallback if PULL fails; also, the sender may ignore requests and send any data it finds convenient to send. Merkle hash trees allow this flexibility without compromising security.
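Why unsolicited data is harmless can be sketched as follows: whether a packet was requested or pushed, the receiver accepts it only if it hashes up to the trusted root. The code below is an illustration under assumptions, not the library's implementation: SHA-256 is chosen arbitrarily as the hash function, the packet count is a power of two, and all function names are hypothetical.

```python
import hashlib


def h(data):
    """Hash function for the sketch; SHA-256 is an assumption."""
    return hashlib.sha256(data).digest()


def build_tree(packets):
    """Return the tree layers, leaves first; len(packets) must be a power of two."""
    layers = [[h(p) for p in packets]]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return layers


def uncles(layers, index):
    """Sibling hashes on the path from leaf `index` up to the root."""
    path = []
    for layer in layers[:-1]:
        path.append(layer[index ^ 1])
        index >>= 1
    return path


def verify(root, index, packet, path):
    """Accept a packet only if it hashes up to the trusted root."""
    digest = h(packet)
    for sib in path:
        # The sibling goes on the left when the current node is a right child.
        digest = h(sib + digest) if index & 1 else h(digest + sib)
        index >>= 1
    return digest == root
```

A receiver that only knows the root hash can call `verify` on any incoming packet; a forged packet, or a genuine packet presented at the wrong position, fails the check and is simply discarded.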

+
+ +

No transmission metadata

+

+
+
+ +

Specifications and documentation

+

+

+

+
+ +

The code

+ +
+ +

Frequently asked questions

+ +

Well, why "swift"?

+

That name has served many other protocols well; we hope it will serve ours well too. You may count it as a meta-joke. The original name for the protocol was "VicTorrent". We also insist on lowercase italic "swift" to keep the name formally unique (for some definition of unique).

+
+ +

How is it different from...

+ +

...TCP?

+

TCP emulates reliable in-order delivery ("data pipe") over chaotic unreliable multi-hop networks. TCP has no idea what data it is dealing with, as the data is passed from the userspace. In our case, the data is fixed in advance and many peers participate in distributing the same data. The order of delivery is of little importance, and unreliability is naturally compensated for by redundancy. Thus, many functions of TCP turn out to be redundant. The only function of TCP that is also critical for swift is congestion control, but... we need our own custom congestion control! Thus, we did not use TCP.

+

That led both to hurdles and to some savings. As one example, every TCP connection needs to maintain buffers for the data that has left the sender's userspace but has not yet arrived at the receiver's userspace. As we know that we are dealing with the same fixed data, we don't need to maintain per-connection buffers.

+
+ +

...UDP?

+

UDP, which is the thinnest wrapper around IP, is our choice of underlying protocol. From the standpoint of ideology, a transport protocol should be implemented directly over IP, but unfortunately that causes some chicken-and-egg problems, like a need to get into the kernel to get deployments, and a need to get deployments to be accepted into the kernel. UDP is also quite nice in regard to NAT penetration.

+
+ +

...BitTorrent?

+

BitTorrent is an application-level protocol and quite a heavy one. We focused on fitting our protocol into the restrictions of the transport layer, assuming that the protocol might eventually be included into operating system kernels. For example, we stripped the protocol of any transmission metadata (the .torrent file), leaving a file's root hash as the only parameter.

+
+ +

...µTorrent's µTP?

+

Historically, BitTorrent gained lots of adaptations to its underlying transport. First and foremost, TCP is unable to prioritize traffic, so BitTorrent needed to coerce users somehow to tolerate the inconveniences of seeding. That caused tit-for-tat and, to a significant degree, rarest-first. Another example is the limitation of 4 upload slots. (Well, apparently, some architectural decisions in BitTorrent were dictated by oddities of Windows 95, but... never mind.) Eventually, BitTorrent developers came to the conclusion that not annoying the user in the first place is probably a better stimulus. So they came up with the LEDBAT congestion control algorithm (Low Extra Delay Background Transport). LEDBAT allows a peer to seed without interfering with the regular traffic (in plain words, without slowing down the browser). To integrate the novel congestion control algorithm into BitTorrent incrementally, BitTorrent Inc had to develop a TCP-like transport named µTP. The swift project (then named VicTorrent) was started when we tried to understand what would happen if we stripped BitTorrent of any Win95-specific, TCP-specific or Python-specific workarounds. As it turns out, not much was left.

+
+ +

...Van Jacobson's CCN?

+

Van Jacobson's team at PARC is doing exploratory research on content-centric networking. While BitTorrent works at layer 5 (application) and we go to layer 4 (transport), the PARC people are bold enough to go to layer 3 and to propose a complete replacement for the entire TCP/IP world. That is certainly a compelling vision, but we focus on the near future (<10 years), while CCNx is a much more ambitious rework.

+
+ +

...DCCP?

+

This question arises quite frequently, as DCCP is a congestion-controlled datagram transport. The option of implementing swift over DCCP was considered, but the inconvenience of working with an esoteric transport was not compensated by the added value of DCCP, which is limited to one mode of congestion control being readily implemented. Architectural restrictions imposed by DCCP were also found to be a major inconvenience. Last but not least, currently only Linux supports DCCP at the kernel level.

+
+ +

...SCTP?

+

+
+ +
+ +

Who we are

+

+

+

+
+ +

Contacts & feedback

+

mail us

+

subscribe to a mailing list

+
+ + + \ No newline at end of file diff --git a/doc/style.css b/doc/style.css new file mode 100644 index 0000000..d8162a9 --- /dev/null +++ b/doc/style.css @@ -0,0 +1,61 @@ +body { + background: #fefeff url("cloud4.jpg") fixed no-repeat; + #background-size: 100%; # + #-moz-background-size: 100% 100%; /* Gecko 1.9.2 (Firefox 3.6) */ + #-o-background-size: 100% 100%; /* Opera 9.5 */ + #-webkit-background-size: 100% 100%; /* Safari 3.0 */ + #-khtml-background-size: 100% 100%; /* Konqueror 3.5.4 */ +} + +h1, h2, h3 { + font-family: Verdana; +} + +p { + text-align: justify; +} + +body > div { + width: 60%; + margin: auto; + margin-top: 64px; + margin-bottom: 64px; + #background: #d0e0ff; + background: rgba(208,224,255,0.8); + padding-top: 16px; + padding-bottom: 16px; +} + +img#logo { + #display: block; + #margin-left: auto; + #margin-right: auto; + #position: relative; + #top: -40px; + position:absolute; + top: 4px; +} + +body > div > h1 { + text-align:center; +} + +body > div > div { + width: 90%; + margin: auto; +} + +div#motto { + text-align: right; + font-style: italic; + font-size: larger; +} + +div.fold>h2, div.fold>h3, div.fold>h4 { + text-decoration: underline; + cursor: pointer; +} + +div.fold > p, div.fold > ul, div.fold > div { + display: none; +} diff --git a/serp++.1 b/serp++.1 deleted file mode 100644 index 06b9ab9..0000000 --- a/serp++.1 +++ /dev/null @@ -1,79 +0,0 @@ -.\"Modified from man(1) of FreeBSD, the NetBSD mdoc.template, and mdoc.samples. -.\"See Also: -.\"man mdoc.samples for a complete listing of options -.\"man mdoc for the short list of editing options -.\"/usr/share/misc/mdoc.template -.Dd 3/6/09 \" DATE -.Dt serp++ 1 \" Program name and manual section number -.Os Darwin -.Sh NAME \" Section Header - required - don't modify -.Nm serp++, -.\" The following lines are read in generating the apropos(man -k) database. Use only key -.\" words here as the database is built based on the words here and in the .ND line. 
-.Nm Other_name_for_same_program(), -.Nm Yet another name for the same program. -.\" Use .Nm macro to designate other names for the documented program. -.Nd This line parsed for whatis database. -.Sh SYNOPSIS \" Section Header - required - don't modify -.Nm -.Op Fl abcd \" [-abcd] -.Op Fl a Ar path \" [-a path] -.Op Ar file \" [file] -.Op Ar \" [file ...] -.Ar arg0 \" Underlined argument - use .Ar anywhere to underline -arg2 ... \" Arguments -.Sh DESCRIPTION \" Section Header - required - don't modify -Use the .Nm macro to refer to your program throughout the man page like such: -.Nm -Underlining is accomplished with the .Ar macro like this: -.Ar underlined text . -.Pp \" Inserts a space -A list of items with descriptions: -.Bl -tag -width -indent \" Begins a tagged list -.It item a \" Each item preceded by .It macro -Description of item a -.It item b -Description of item b -.El \" Ends the list -.Pp -A list of flags and their descriptions: -.Bl -tag -width -indent \" Differs from above in tag removed -.It Fl a \"-a flag as a list item -Description of -a flag -.It Fl b -Description of -b flag -.El \" Ends the list -.Pp -.\" .Sh ENVIRONMENT \" May not be needed -.\" .Bl -tag -width "ENV_VAR_1" -indent \" ENV_VAR_1 is width of the string ENV_VAR_1 -.\" .It Ev ENV_VAR_1 -.\" Description of ENV_VAR_1 -.\" .It Ev ENV_VAR_2 -.\" Description of ENV_VAR_2 -.\" .El -.Sh FILES \" File used or created by the topic of the man page -.Bl -tag -width "/Users/joeuser/Library/really_long_file_name" -compact -.It Pa /usr/share/file_name -FILE_1 description -.It Pa /Users/joeuser/Library/really_long_file_name -FILE_2 description -.El \" Ends the list -.\" .Sh DIAGNOSTICS \" May not be needed -.\" .Bl -diag -.\" .It Diagnostic Tag -.\" Diagnostic informtion here. -.\" .It Diagnostic Tag -.\" Diagnostic informtion here. -.\" .El -.Sh SEE ALSO -.\" List links in ascending order by section, alphabetically within a section. 
-.\" Please do not reference files that do not exist without filing a bug report -.Xr a 1 , -.Xr b 1 , -.Xr c 1 , -.Xr a 2 , -.Xr b 2 , -.Xr a 3 , -.Xr b 3 -.\" .Sh BUGS \" Document known, unremedied bugs -.\" .Sh HISTORY \" Document history if command behaves in a unique manner \ No newline at end of file diff --git a/tests/filetest.cpp b/tests/filetest.cpp deleted file mode 100644 index 952bcf9..0000000 --- a/tests/filetest.cpp +++ /dev/null @@ -1,36 +0,0 @@ -#include "file.h" -#include - -TEST(FileTest,mmap) { - // open - // mmap - // unmap - // mmap - // read - // close - // open - // read fails - // mmap - // read - // close -} - -TEST(FileTest,retrieval) { - // create with a root hash - // supply with hashes and data - // check peak hashes - // one broken packet - // check history - // close - // verify -} - -TEST(FileTest,Streaming) { -} - -int main (int argc, char** argv) { - - testing::InitGoogleTest(&argc, argv); - return RUN_ALL_TESTS(); - -}