BitTorrent
BitTorrent is a protocol supporting the practice of peer-to-peer file sharing that is used to distribute large amounts of data over the Internet.
BitTorrent is one of the most common protocols for transferring large
files, and peer-to-peer networks have been estimated to collectively
account for approximately 43% to 70% of all Internet traffic (depending on geographical location) as of February 2009. In November of 2004, BitTorrent was responsible for 35% of all Internet traffic. As of February 2013, BitTorrent was responsible for 3.35% of all
worldwide bandwidth, more than half of the 6% of total bandwidth
dedicated to file sharing.
Programmer Bram Cohen, a former graduate student in computer science at the University at Buffalo, designed the protocol in April 2001 and released the first available version on July 2, 2001, and the final version in 2008. BitTorrent clients are available for a variety of computing platforms and operating systems, including an official client released by BitTorrent, Inc.
As of 2009, BitTorrent reportedly had about the same number of active users online as viewers of YouTube and Facebook combined. As of January 2012, BitTorrent is utilized by 150 million active users (according to BitTorrent, Inc.). Based on this figure, the total number of monthly BitTorrent users can be estimated at more than a quarter of a billion.
Description
The BitTorrent protocol can be used to reduce the server and network impact of distributing large files. Rather than downloading a file from a single source server, the BitTorrent protocol allows users to join a "swarm" of hosts to download and upload from each other simultaneously. The protocol is an alternative to the older single source, multiple mirror sources technique for distributing data, and can work over networks with lower bandwidth. Using the BitTorrent protocol, several basic computers, such as home computers, can replace large servers while efficiently distributing files to many recipients. This lower bandwidth usage also helps prevent large spikes in internet traffic in a given area, keeping internet speeds higher for all users in general, regardless of whether or not they use the BitTorrent protocol.
A user who wants to upload a file first creates a small torrent descriptor file that they distribute by conventional means (web, email, etc.). They then make the file itself available through a BitTorrent node acting as a seed. Those with the torrent descriptor file can give it to their own BitTorrent nodes which, acting as peers or leechers, download it by connecting to the seed and/or other peers.
The file being distributed is divided into segments called pieces. As each peer receives a new piece of the file it becomes a source (of that piece) for other peers, relieving the original seed from having to send that piece to every computer or user wishing a copy. With BitTorrent, the task of distributing the file is shared by those who want it; it is entirely possible for the seed to send only a single copy of the file itself and eventually distribute to an unlimited number of peers.
Each piece is protected by a cryptographic hash contained in the torrent descriptor. This ensures that any modification of the piece can be reliably detected, and thus prevents both accidental and malicious modifications of any of the pieces received at other nodes. If a node starts with an authentic copy of the torrent descriptor, it can verify the authenticity of the entire file it receives.
Pieces are typically downloaded non-sequentially and are rearranged into the correct order by the BitTorrent client, which tracks which pieces it still needs and which pieces it already has and can upload to other peers. Pieces are of the same size throughout a single download (for example, a 10 MB file may be transmitted as ten 1 MB pieces or as forty 256 KB pieces). Because of this approach, the download of any file can be halted at any time and resumed at a later date without losing previously downloaded data, which makes BitTorrent particularly useful for transferring larger files. It also lets the client seek out readily available pieces and download them immediately, rather than halting the download and waiting for the next (and possibly unavailable) piece in line, which typically reduces the overall download time.
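The piece arithmetic in the example above can be checked with a few lines of Python. This is only a sketch: the helper names are made up for illustration, and it assumes that when the file size is not an exact multiple of the piece size, the final piece is simply shorter than the others.

```python
KB = 1024
MB = 1024 * KB

def piece_count(file_size: int, piece_length: int) -> int:
    """Number of pieces needed to cover file_size bytes (ceiling division)."""
    return -(-file_size // piece_length)

def last_piece_size(file_size: int, piece_length: int) -> int:
    """Size of the final piece; assumed shorter when sizes do not divide evenly."""
    remainder = file_size % piece_length
    return remainder if remainder else piece_length

tens = piece_count(10 * MB, 1 * MB)       # ten 1 MB pieces
forties = piece_count(10 * MB, 256 * KB)  # forty 256 KB pieces
```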
When a peer completely downloads a file, it becomes an additional seed. This eventual shift from peers to seeders determines the overall "health" of the file (as determined by the number of times a file is available in its complete form).
The distributed nature of BitTorrent can lead to a flood-like spreading of a file throughout many peer computer nodes. As more peers join the swarm, the likelihood of a complete successful download by any particular node increases. Relative to traditional Internet distribution schemes, this permits a significant reduction in the original distributor's hardware and bandwidth resource costs.
Distributed downloading protocols in general provide redundancy against system problems, reduce dependence on the original distributor, and provide sources for the file that are generally transient. Such sources are harder for those who would block distribution to trace than a file that is available only from a fixed host machine (or even several).
One such example of BitTorrent being used to reduce the distribution cost of file transmission is in the BOINC Client-Server system. If a BOINC distributed computing application needs to be updated (or merely sent to a user) it can do so with little impact on the BOINC Server.
Operation
A BitTorrent client is any program that implements the BitTorrent protocol. Each client is capable of preparing, requesting, and transmitting any type of computer file over a network, using the protocol. A peer is any computer running an instance of a client.
To share a file or group of files, a peer first creates a small file called a "torrent" (e.g. MyFile.torrent). This file contains metadata about the files to be shared and about the tracker, the computer that coordinates the file distribution. Peers that want to download the file must first obtain a torrent file for it and connect to the specified tracker, which tells them from which other peers to download the pieces of the file.
Though both ultimately transfer files over a network, a BitTorrent download differs from a classic download (as is typical with an HTTP or FTP request, for example) in several fundamental ways:
- BitTorrent makes many small data requests over different TCP connections to different machines, while classic downloading is typically made via a single TCP connection to a single machine.
- BitTorrent downloads in a random or in a "rarest-first" approach that ensures high availability, while classic downloads are sequential.
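The "rarest-first" approach mentioned in the list above can be sketched as follows. This is an illustrative toy rather than any client's actual algorithm: the function name, the peer identifiers, and the tie-breaking rule (a random choice among equally rare pieces) are all assumptions.

```python
import random
from collections import Counter
from typing import Dict, Optional, Set

def pick_rarest(needed: Set[int],
                peer_pieces: Dict[str, Set[int]],
                rng=random) -> Optional[int]:
    """Return a needed piece index held by the fewest peers, or None."""
    availability = Counter()
    for pieces in peer_pieces.values():
        # Count only pieces we still need and that some peer can supply.
        availability.update(pieces & needed)
    if not availability:
        return None
    rarest = min(availability.values())
    candidates = sorted(p for p, n in availability.items() if n == rarest)
    return rng.choice(candidates)  # assumed tie-break: random among rarest

# Hypothetical swarm: piece 2 is held by only one peer.
swarm = {"peer-a": {0, 1, 2}, "peer-b": {0, 1}, "peer-c": {0}}
choice = pick_rarest({0, 1, 2}, swarm)
```

Requesting the least replicated piece first spreads scarce pieces through the swarm sooner, which is how this approach keeps availability high.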
In general, BitTorrent's non-contiguous download methods have prevented it from supporting progressive download or "streaming playback". However, comments made by Bram Cohen in January 2007 suggest that streaming torrent downloads will soon be commonplace and ad supported streaming appears to be the result of those comments. In January 2011 Cohen demonstrated an early version of BitTorrent streaming, saying the feature was projected to be available by summer 2011. As of 2013, this new BitTorrent streaming protocol is available for beta testing.
Creating and publishing torrents
The peer distributing a data file treats the file as a number of identically sized pieces, usually with byte sizes of a power of 2, and typically between 32 kB and 16 MB each. The peer creates a hash for each piece, using the SHA-1 hash function, and records it in the torrent file. Pieces with sizes greater than 512 kB will reduce the size of a torrent file for a very large payload, but are claimed to reduce the efficiency of the protocol. When another peer later receives a particular piece, the hash of the piece is compared to the recorded hash to test that the piece is error-free. Peers that provide a complete file are called seeders, and the peer providing the initial copy is called the initial seeder.
The exact information contained in the torrent file depends on the version of the BitTorrent protocol. By convention, the name of a torrent file has the suffix .torrent. Torrent files have an "announce" section, which specifies the URL of the tracker, and an "info" section, containing (suggested) names for the files, their lengths, the piece length used, and a SHA-1 hash code for each piece, all of which are used by clients to verify the integrity of the data they receive.
Torrent files are typically published on websites or elsewhere, and registered with at least one tracker. The tracker maintains lists of the clients currently participating in the torrent. Alternatively, in a trackerless system (decentralized tracking) every peer acts as a tracker. Azureus was the first BitTorrent client to implement such a system through the distributed hash table (DHT) method. An alternative and incompatible DHT system, known as Mainline DHT, was later developed and adopted by the BitTorrent (Mainline), µTorrent, Transmission, rTorrent, KTorrent, BitComet, and Deluge clients.
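The hashing and verification steps described above can be sketched in Python. The sketch keeps the metadata as a plain dictionary instead of writing a real .torrent file, and covers only a single file. The tracker URL and file name are placeholders, and the key names inside "info" ("name", "length", "piece length", "pieces") follow the commonly documented single-file layout, which should be checked against the specification for the protocol version in use.

```python
import hashlib

SHA1_LEN = 20  # a SHA-1 digest is 20 bytes long

def split_pieces(data: bytes, piece_length: int):
    """Cut the payload into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]

def build_metadata(name: str, data: bytes, piece_length: int, tracker_url: str):
    """Build a dictionary with an "announce" and an "info" section."""
    digests = [hashlib.sha1(p).digest() for p in split_pieces(data, piece_length)]
    return {
        "announce": tracker_url,
        "info": {
            "name": name,                  # suggested file name
            "length": len(data),           # file length in bytes
            "piece length": piece_length,
            "pieces": b"".join(digests),   # concatenated per-piece hashes
        },
    }

def verify_piece(metadata, index: int, piece: bytes) -> bool:
    """Compare a received piece against the hash recorded for that index."""
    digests = metadata["info"]["pieces"]
    expected = digests[index * SHA1_LEN:(index + 1) * SHA1_LEN]
    return hashlib.sha1(piece).digest() == expected

payload = bytes(i % 251 for i in range(2560))  # stand-in for a real file
meta = build_metadata("example.bin", payload, 1024,
                      "http://tracker.example/announce")  # placeholder URL
```

A downloader would call `verify_piece` on every piece it receives and discard any piece for which it returns `False`.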
After the DHT was adopted, a "private" flag — analogous to the broadcast flag — was unofficially introduced, telling clients to restrict the use of decentralized tracking regardless of the user's desires. The flag is intentionally placed in the info section of the torrent so that it cannot be disabled or removed without changing the identity of the torrent. The purpose of the flag is to prevent torrents from being shared with clients that do not have access to the tracker. The flag was requested for inclusion in the official specification in August, 2008, but has not been accepted yet. Clients that have ignored the private flag were banned by many trackers, discouraging the practice.
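To illustrate why placing the flag in the info section ties it to the torrent's identity, the sketch below assumes that the identity is the SHA-1 hash of the encoded info section (often called the info-hash), that torrent metadata is serialized with bencoding, and that the flag is stored as an integer key named "private". The encoder is a minimal illustration, not a complete or validated implementation, and the example values are invented.

```python
import hashlib

def bencode(value) -> bytes:
    """Minimal bencoding of ints, strings, bytes, lists and dicts."""
    if isinstance(value, bool):
        raise TypeError("booleans are not representable")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and are emitted in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot encode {type(value).__name__}")

def info_hash(torrent: dict) -> str:
    """Identity of the torrent: SHA-1 over the encoded info section only."""
    return hashlib.sha1(bencode(torrent["info"])).hexdigest()

info = {"name": "example.bin", "length": 2560,
        "piece length": 1024, "pieces": b"\x00" * 60}  # dummy hashes
public = {"announce": "http://tracker.example/announce", "info": dict(info)}
private = {"announce": "http://tracker.example/announce",
           "info": dict(info, private=1)}
moved = {"announce": "http://other.example/announce", "info": dict(info)}
```

Under these assumptions, `public` and `moved` share the same info-hash because only the tracker URL differs, while `private` has a different one. Stripping the flag therefore produces what other peers and trackers see as a different torrent.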
Downloading torrents and sharing files
Users find a torrent of interest, by browsing the web or by other means, download it, and open it with a BitTorrent client. The client connects to the tracker(s) specified in the torrent file, from which it receives a list of peers currently transferring pieces of the file(s) specified in the torrent. The client connects to those peers to obtain the various pieces. If the swarm contains only the initial seeder, the client connects directly to it and begins to request pieces.
Clients incorporate mechanisms to optimize their download and upload rates; for example, they download pieces in a random order to increase the opportunity to exchange data, which is only possible if two peers have different pieces of the file.
The effectiveness of this data exchange depends largely on the policies that clients use to determine to whom to send data. Clients may prefer to send data to peers that send data back to them (a tit-for-tat scheme), which encourages fair trading. But strict policies often result in suboptimal situations, such as when newly joined peers are unable to receive any data because they do not yet have any pieces to trade, or when two peers with a good connection between them do not exchange data simply because neither of them takes the initiative. To counter these effects, the official BitTorrent client program uses a mechanism called "optimistic unchoking", whereby the client reserves a portion of its available bandwidth for sending pieces to random peers (not necessarily known good partners, the so-called preferred peers) in hopes of discovering even better partners and to ensure that newcomers get a chance to join the swarm.
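The combination of tit-for-tat and optimistic unchoking can be sketched as a peer-selection function. The details here are assumptions made for illustration, not the official client's actual policy: peers are ranked by how fast they have recently sent data to us, three of them get regular upload slots, and one extra slot goes to a randomly chosen peer from the rest.

```python
import random
from typing import Dict, List, Optional, Tuple

def choose_unchoked(download_rates: Dict[str, float],
                    regular_slots: int = 3,
                    rng=random) -> Tuple[List[str], Optional[str]]:
    """Pick peers to upload to: best reciprocators plus one random peer."""
    # Tit-for-tat: prefer peers that have been sending us the most data.
    ranked = sorted(download_rates, key=lambda p: download_rates[p], reverse=True)
    preferred = ranked[:regular_slots]
    # Optimistic unchoke: give one other peer a chance, even a newcomer
    # that has not been able to send us anything yet.
    others = ranked[regular_slots:]
    optimistic = rng.choice(others) if others else None
    return preferred, optimistic

# Hypothetical rates in kB/s; "new-1" and "new-2" just joined the swarm.
rates = {"a": 50.0, "b": 40.0, "c": 30.0, "new-1": 0.0, "new-2": 0.0}
preferred, optimistic = choose_unchoked(rates)
```

Re-running the selection periodically lets a newcomer that turns out to be a good partner move into the preferred set.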
Although swarming scales well to tolerate flash crowds for popular content, it is less useful for unpopular content. Peers arriving after the initial rush might find the content unavailable and need to wait for the arrival of a seed in order to complete their downloads. The seed arrival, in turn, may take long to happen (this is termed the seeder promotion problem). Since maintaining seeds for unpopular content entails high bandwidth and administrative costs, this runs counter to the goals of publishers that value BitTorrent as a cheap alternative to a client-server approach. This occurs on a huge scale; measurements have shown that 38% of all new torrents become unavailable within the first month. A strategy adopted by many publishers which significantly increases availability of unpopular content consists of bundling multiple files in a single swarm.
More sophisticated solutions have also been proposed; generally, these use cross-torrent mechanisms through which multiple torrents can cooperate to better share content.
BitTorrent does not offer its users anonymity. It is possible to obtain the IP addresses of all current and possibly previous participants in a swarm from the tracker. This may expose users with insecure systems to attacks. It may also expose users to the risk of being sued, if they are distributing files without permission from the copyright holder(s). However, there are ways to promote anonymity; for example, the OneSwarm project layers privacy-preserving sharing mechanisms on top of the original BitTorrent protocol.
Adoption
A growing number of individuals and organizations are using BitTorrent to distribute their own or licensed material. Independent adopters report that without using BitTorrent technology, and its dramatically reduced demands on their private networking hardware and bandwidth, they could not afford to distribute their files.
Film, video, and music
- BitTorrent Inc. has obtained a number of licenses from Hollywood studios for distributing popular content from their websites.
- Sub Pop Records releases tracks and videos via BitTorrent Inc. to distribute its 1000+ albums. Babyshambles and The Libertines (both bands associated with Pete Doherty) have extensively used torrents to distribute hundreds of demos and live videos. US industrial rock band Nine Inch Nails frequently distributes albums via BitTorrent.
- Podcasting software is starting to integrate BitTorrent to help podcasters deal with the download demands of their MP3 "radio" programs. Specifically, Juice and Miro (formerly known as Democracy Player) support automatic processing of .torrent files from RSS feeds. Similarly, some BitTorrent clients, such as µTorrent, are able to process web feeds and automatically download content found within them.
- DGM Live purchases are provided via BitTorrent.
- Vodo is a service that distributes "free-to-share" movies and TV shows via BitTorrent.
Broadcasters
- In 2008, the CBC became the first public broadcaster in North America to make a full show (Canada's Next Great Prime Minister) available for download using BitTorrent.
- The Norwegian Broadcasting Corporation (NRK) has experimented with BitTorrent distribution since March 2008, with the material available online. Only selected material for which NRK owns all royalties is published. Responses have been very positive, and NRK is planning to offer more content.
- The Dutch VPRO broadcasting organization released four documentaries in 2009 and 2010 under a Creative Commons license using the content distribution feature of the Mininova tracker.
Personal material
- The Amazon S3 "Simple Storage Service" is a scalable Internet-based storage service with a simple web service interface, equipped with built-in BitTorrent support.
- Blog Torrent offers a simplified BitTorrent tracker to enable bloggers and non-technical users to host a tracker on their site. Blog Torrent also allows visitors to download a "stub" loader, which acts as a BitTorrent client to download the desired file, allowing users without BitTorrent software to use the protocol. This is similar to the concept of a self-extracting archive.
Software
- Blizzard Entertainment uses BitTorrent (via a proprietary client called the "Blizzard Downloader") to distribute content and patches for Diablo III, StarCraft II and World of Warcraft, including the games themselves.
- CCP Games, maker of the space simulation MMORPG Eve Online, has announced that a new launcher will be released that is based on BitTorrent.
- Many games, especially those whose large size makes them difficult to host due to bandwidth limits, extremely frequent downloads, and unpredictable changes in network traffic, are instead distributed through a specialized, stripped-down BitTorrent client with enough functionality to download the game from the other running clients and the primary server (which is maintained in case not enough peers are available).
- Many major open source and free software projects encourage BitTorrent as well as conventional downloads of their products (via HTTP, FTP etc.) to increase availability and to reduce load on their own servers, especially when dealing with larger files.
Government
- The UK government used BitTorrent to distribute details about how the tax money of UK citizens was spent.
Education
- Florida State University uses BitTorrent to distribute large scientific data sets to its researchers.
- Many universities that have BOINC distributed computing projects have used the BitTorrent functionality of the client-server system to reduce the bandwidth costs of distributing the client side applications used to process the scientific data.
Others
- Facebook uses BitTorrent to distribute updates to Facebook servers.
- Twitter uses BitTorrent to distribute updates to Twitter servers.
- The Internet Archive added BitTorrent to its file download options for over 1.3 million existing files, and all newly uploaded files, in August 2012. This method is the fastest means of downloading media from the Archive.
CableLabs, the research organization of the North American cable industry, estimates that BitTorrent represents 18% of all broadband traffic. In 2004, CacheLogic put that number at roughly 35% of all traffic on the Internet. The discrepancies in these numbers are caused by differences in the method used to measure P2P traffic on the Internet.
Routers that use network address translation (NAT) must maintain tables of source and destination IP addresses and ports. Typical home routers are limited to about 2000 table entries while some more expensive routers have larger table capacities. BitTorrent frequently contacts 20–30 servers per second, rapidly filling the NAT tables. This is a known cause of some home routers ceasing to work correctly.
Technologies built on BitTorrent
The BitTorrent protocol is still under development and may therefore still acquire new features and other enhancements such as improved efficiency.
Web seeding
Web seeding was implemented in 2006 as the ability of BitTorrent clients to download torrent pieces from an HTTP source in addition to the swarm. The advantage of this feature is that a website may distribute a torrent for a particular file or batch of files and make those files available for download from that same web server; this can simplify long-term seeding and load balancing through the use of existing, cheap, web hosting setups. In theory, this would make using BitTorrent almost as easy for a web publisher as creating a direct HTTP download. In addition, it would allow the "web seed" to be disabled if the swarm becomes too popular while still allowing the file to be readily available.
This feature has two distinct and incompatible specifications.
The first was created by John "TheSHAD0W" Hoffman, who created BitTornado. From version 5.0 onward, the Mainline BitTorrent client also supports web seeds, and the BitTorrent web site had a simple publishing tool that creates web seeded torrents. µTorrent added support for web seeds in version 1.7. BitComet added support for web seeds in version 1.14. This first specification requires running a web service that serves content by info-hash and piece number, rather than filename.
The other specification was created by the authors of GetRight and can rely on a basic HTTP download space (using byte serving).
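With byte serving, a client can fetch a single piece from an ordinary web server by asking for the matching byte range. The sketch below only computes that range and the corresponding HTTP Range header. It assumes a single-file torrent whose pieces map directly onto file offsets, and the function names are illustrative rather than taken from either specification.

```python
from typing import Dict, Tuple

def piece_byte_range(index: int, piece_length: int,
                     total_length: int) -> Tuple[int, int]:
    """Inclusive (first, last) byte offsets of a piece within the file."""
    start = index * piece_length
    if not 0 <= start < total_length:
        raise IndexError("piece index out of range")
    # The final piece may be shorter than piece_length.
    end = min(start + piece_length, total_length) - 1
    return start, end

def range_header(index: int, piece_length: int,
                 total_length: int) -> Dict[str, str]:
    """HTTP request header asking the web seed for exactly one piece."""
    start, end = piece_byte_range(index, piece_length, total_length)
    return {"Range": f"bytes={start}-{end}"}

# A 1,000,000-byte file split into 256 KiB pieces has four pieces.
first = piece_byte_range(0, 262144, 1_000_000)
last = range_header(3, 262144, 1_000_000)
```

The data returned for each range would still be checked against the piece hash in the torrent file, exactly as for a piece received from a peer.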
In September 2010, a new service named Burnbit was launched which generates a torrent from any URL using webseeding.
There are server-side solutions that provide initial seeding of the file from the web server via the standard BitTorrent protocol and stop serving the file from the original source once the number of external seeders reaches a limit.
Multitracker
Another unofficial feature is an extension to the BitTorrent metadata format proposed by John Hoffman and implemented by several indexing websites. It allows the use of multiple trackers per file, so if one tracker fails, others can continue to support file transfer. It is implemented in several clients, such as BitComet, BitTornado, BitTorrent, KTorrent, Transmission, Deluge, µTorrent, rTorrent, Vuze, and Frostwire. Trackers are placed in groups, or tiers, with a tracker randomly chosen from the top tier and tried, moving to the next tier if all the trackers in the top tier fail.
Torrents with multiple trackers (called MultiTorrents by indexing website MyBittorrent.com) can decrease the time it takes to download a file, but also have a few consequences:
- Poorly implemented clients may contact multiple trackers, leading to more overhead-traffic.
- Torrents from closed trackers suddenly become downloadable by non-members, as they can connect to a seed via an open tracker.
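The tier behaviour described above (try trackers from the top tier in random order and fall back to the next tier only when the whole tier fails) can be sketched as follows. The `try_tracker` callback is a hypothetical stand-in for a real announce request, and the tracker URLs are placeholders.

```python
import random
from typing import Callable, List, Optional

def announce(tiers: List[List[str]],
             try_tracker: Callable[[str], bool],
             rng=random) -> Optional[str]:
    """Return the first tracker URL that responds, honouring tier order."""
    for tier in tiers:
        candidates = list(tier)
        rng.shuffle(candidates)      # random order within a tier
        for url in candidates:
            if try_tracker(url):     # hypothetical: True if the tracker answered
                return url
        # Every tracker in this tier failed; fall through to the next tier.
    return None

tiers = [
    ["http://a.example/announce", "http://b.example/announce"],  # top tier
    ["http://c.example/announce"],                                # backup tier
]
reachable = {"http://c.example/announce"}
used = announce(tiers, lambda url: url in reachable)
```

Contacting one tracker at a time, as in this sketch, avoids the extra overhead traffic that the first bullet above attributes to poorly implemented clients.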
Development
An unimplemented (as of February 2008) unofficial feature is Similarity Enhanced Transfer
(SET), a technique for improving the speed at which peer-to-peer file
sharing and content distribution systems can share data. SET, proposed
by researchers Pucha, Andersen, and Kaminsky, works by spotting chunks
of identical data in files that are an exact or near match to the one
needed and transferring these data to the client if the "exact" data are
not present. Their experiments suggested that SET will help greatly
with less popular files, but not as much for popular data, where many
peers are already downloading it. Andersen believes that this technique could be immediately used by developers with the BitTorrent file sharing system.
As of December 2008, BitTorrent, Inc. is working with Oversi on new
Policy Discover Protocols that query the ISP for capabilities and
network architecture information. Oversi's ISP hosted NetEnhancer box is
designed to "improve peer selection" by helping peers find local nodes,
improving download speeds while reducing the loads into and out of the
ISP's network.
Legal issues
Main article: Legal issues with BitTorrent
There has been much controversy over the use of BitTorrent trackers.
BitTorrent metafiles themselves do not store file contents. Whether the
publishers of BitTorrent metafiles violate copyrights by linking to
copyrighted material without the authorization of copyright holders is
controversial.
Various jurisdictions have pursued legal action against websites that
host BitTorrent trackers. High-profile examples include the closing of Suprnova.org, TorrentSpy, LokiTorrent, BTJunkie, Mininova, Demonoid and Oink's Pink Palace. The Pirate Bay
torrent website, formed by a Swedish group, is noted for the "legal"
section of its website in which letters and replies on the subject of
alleged copyright infringements are publicly displayed. On May 31, 2006,
The Pirate Bay's servers in Sweden were raided by Swedish police on
allegations by the MPAA of copyright infringement; however, the tracker was up and running again three days later.
In the study used to value NBC Universal in its merger with Comcast,
Envisional examined the 10,000 torrent swarms managed by PublicBT which
had the most active downloaders. After excluding pornographic and
unidentifiable content, it was found that only one swarm offered
legitimate content.
Between 2010 and 2012, 200,000 people were sued for uploading and downloading copyrighted content through BitTorrent.
In 2011, 18.8% of North American internet traffic was used by
peer-to-peer networks which equates to 132 billion music file transfers
and 11 billion movie file transfers on the BitTorrent network.
On April 30, 2012 the UK High Court ordered five ISPs to block BitTorrent search engine The Pirate Bay.
BitTorrent and malware
Several studies on BitTorrent have indicated that a large portion of files available for download via BitTorrent contain malware. In particular, one small sample indicated that 18% of all executable programs available for download contained malware. Another study claims that as much as 14.5% of BitTorrent downloads contain zero-day malware, and that BitTorrent was used as the distribution mechanism for 47% of all zero-day malware they have found.