Cleanfeed: the facts

This article is a write-up of notes from a meeting with Mike Galvin, Director
of Internet Services for BT Retail, at which he discussed BT’s Cleanfeed project.
He asked that his comments be treated as “private to the ISP community” but “not confidential; not under NDA”.

Purpose

  • BT’s Cleanfeed is a filter designed to prevent casual web access to URLs
    listed on the Internet Watch Foundation’s “Child Abuse Images” database.

Name

  • “Cleanfeed” was an internal codename used at BT, which was leaked to the media by a third party. It is used as a trademark by other companies
    in various other contexts.

The database

  • The database provided by the IWF that is used for Cleanfeed contains URLs
  • BT has agreed with IWF that this database should contain
    • URLs for images that the IWF considers to be indecent photographs or
      pseudo-photographs of children (“illegal images”)
    • URLs for web pages that embed such illegal images
  • BT has agreed with the IWF that this database should not contain
    • URLs that might in themselves be considered an “illegal advertisement”
    • URLs for obscene material that does not involve children
    • Generally, URLs for any content that is legal to possess
  • In principle, generic proxy services will not be added to the database, although web sites that specifically provide proxy access to illegal images will not be exempt from inclusion

Technical design of the filter

  • Cleanfeed operates as a two stage filter on Internet traffic.
    • At stage one, Internet traffic destined for port 80 with an IP address “on the list” is diverted to a stage two filter; all other traffic passes normally across BT’s network. Strictly speaking, the relevant IP addresses are those that result from a DNS lookup of the host name contained in a URL in the database.
    • At stage two, of the subset of Internet traffic that has been diverted, that which is an HTTP (web) request is examined for the requested URL.
    • If the URL of the web request exactly matches a URL in the database,
      BT impersonates the destination web server and returns an HTTP 404 status code, which will cause a web browser to display a “Not found” message to the user.
    • Otherwise, the traffic is passed across BT’s network for a normal response
      from the destination.
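
The two stages described above can be sketched in a few lines of Python. This is an illustration of the logic only, not BT’s implementation: the URLs, the IP addresses (taken from documentation ranges), the `fake_resolver` stand-in for DNS and all function names are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical database entry; the real list is supplied by the IWF.
BLOCKED_URLS = {"http://bad.example/images/1.jpg"}


def fake_resolver(host):
    """Stand-in for a DNS lookup: returns the set of IPs for a host name."""
    table = {"bad.example": {"192.0.2.10"}, "good.example": {"192.0.2.20"}}
    return table.get(host, set())


def build_stage_one_ips(urls, resolve):
    """Derive the stage-one IP list by resolving the host of each listed URL."""
    ips = set()
    for url in urls:
        ips |= resolve(urlsplit(url).hostname)
    return ips


def stage_one(dst_ip, dst_port, listed_ips):
    """Divert only port 80 traffic whose destination IP is on the list."""
    return dst_port == 80 and dst_ip in listed_ips


def stage_two(requested_url, blocked_urls):
    """Block only on an exact URL match."""
    return requested_url in blocked_urls


def handle(dst_ip, dst_port, requested_url, listed_ips, blocked_urls):
    """Return "404" for a blocked request, "pass" for everything else."""
    if not stage_one(dst_ip, dst_port, listed_ips):
        return "pass"  # never reaches the stage two filter
    if stage_two(requested_url, blocked_urls):
        return "404"   # fake "Not found" returned in place of the server
    return "pass"      # diverted, but forwarded to the real destination
```

Note that in this sketch a request for a different URL on the same IP address is diverted at stage one but still passes at stage two, which is the point of the exact-match test.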

Filter design limitations

  • Cleanfeed will not block FTP, peer to peer, e-mail MIME attachments, or any application protocol other than HTTP, regardless of which port is used.
  • Cleanfeed will not block web traffic carried over any port other than port 80; this is the standard port for unencrypted web traffic, although other ports are also used. However, traffic to other ports could be examined if a relatively minor configuration change were made to the system.
  • Cleanfeed will not block encrypted web traffic.
  • Cleanfeed will not block access attempts made through proxy services unless the proxy is itself contained in the database.
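
As a rough illustration of these limitations, the check below models which traffic would ever reach the URL comparison. It is a simplification under stated assumptions: `LISTED_IPS`, the function name and its parameters are invented for the example, and a proxy is modelled simply as a destination IP that is not on the list.

```python
# Hypothetical stage-one list (documentation-range address).
LISTED_IPS = {"192.0.2.10"}


def reaches_url_check(protocol, dst_ip, dst_port, encrypted=False):
    """True only for unencrypted HTTP on port 80 to a listed IP address."""
    return (
        protocol == "http"
        and not encrypted
        and dst_port == 80
        and dst_ip in LISTED_IPS
    )


# Direct, unencrypted web request to a listed host: examined.
direct = reaches_url_check("http", "192.0.2.10", 80)
# FTP, a non-standard port, encrypted traffic, or a request sent to an
# unlisted proxy: none of these is compared against the database.
ftp = reaches_url_check("ftp", "192.0.2.10", 21)
alt_port = reaches_url_check("http", "192.0.2.10", 8080)
https = reaches_url_check("http", "192.0.2.10", 443, encrypted=True)
via_proxy = reaches_url_check("http", "198.51.100.5", 80)
```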

Deployment at BT

  • The first-stage filter currently only diverts traffic from BT’s own retail customers.
  • BT can, on request, apply Cleanfeed for wholesale customers using its BT
    Net or BT Central Plus products.
  • BT considers that it is pointless for its peering partners to ask for Cleanfeed to be applied to their traffic where such partners have other peering relationships. Such ISPs should, if they wish to use Cleanfeed, install their own deployment.

Costs

  • BT has calculated the cost of hardware and software license fees in the system as deployed at BT to be £327,000.
  • BT estimates that the total repeatable deployment cost, including the above but excluding original research and development, was around £500,000.
  • BT estimates that it spent around another £500,000 in original R&D on this project.
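
Because the £327,000 figure is included in the deployment estimate rather than added to it, the arithmetic is worth spelling out. The two £500,000 figures are BT’s round estimates, so the derived numbers below are approximate; the variable names are chosen for this example.

```python
hardware_and_licences = 327_000   # calculated by BT
repeatable_deployment = 500_000   # approximate; includes the line above
original_rd = 500_000             # approximate

# Deployment cost other than hardware and software licences.
other_deployment = repeatable_deployment - hardware_and_licences

# Approximate total spend on the project, R&D included.
total_outlay = repeatable_deployment + original_rd

print(other_deployment)  # 173000
print(total_outlay)      # 1000000
```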

Technical risk factors and mitigation

  • BT considers that there is a high probability that attempts will be made to attack Cleanfeed, either as denial-of-service attacks on the system or as attempts to compromise the integrity or confidentiality of the database.
  • BT has attempted to hide the physical and logical location of the Cleanfeed system within its network.
  • The database is received from IWF in encrypted form, is stored in encrypted form even when in use, and is only decrypted so as to produce the list of IP addresses for the first-stage filter, and to convert it from one encryption standard to another.
  • BT has closely compartmentalised detailed knowledge of Cleanfeed within the company. In particular, BT has made every effort to ensure that none of its staff are capable of gaining access to the list of URLs, regardless of corporate authorisation.
  • BT does not log the IP addresses of users whose traffic is an attempt to reach a listed URL.

Quality assurance

  • BT has agreed with IWF and the Home Office that an independent academic audit shall be made of IWF’s processes and procedures so as to ensure that it is capable of meeting the standards that it has set for itself.
    • As this is an audit of procedures this shall not include examination of any individual decisions with regard to particular URLs.
    • The audit report will be made to the Chair and Chief Executive of the IWF. The Home Office will receive a copy of the audit report. BT will not receive a copy.
  • BT has agreed with the IWF and the Home Office that an independent appeals process shall be created for use by persons whose own sites appear on the IWF’s database and who wish to challenge that designation.
    • The adjudicator shall be appointed by POLIT, a unit within the National
      Crime Squad dedicated to online paedophilia.
    • BT will not confirm whether a particular site is indeed listed (and, indeed, does not know). The “HTTP status code 404” response is specifically designed to hide from the user that access has been blocked.

Publicity

  • The first news of the Cleanfeed project was leaked to the media by a children’s charity with which BT had been discussing the project; this leak was very unwelcome to BT and extremely unhelpfully timed, coming as it did just as BT began testing the system.
  • The figure of “20,000 hits per day” was leaked to the press but BT is unwilling to say how. It represents the number of HTTP requests that are actually blocked by Cleanfeed, i.e. the number of fake 404 error codes generated. BT believes that search engine listings, spam, pop-ups and links from adult pornography sites probably represent a substantial portion of this number, as opposed to deliberate attempts to access child pornography.

Policy risks and mitigation

  • Scope creep is a serious risk
    • The Home Office originally indicated to BT that Cleanfeed might be employed to block access to other undesirable content
    • Wanadoo has already been approached by the British Phonographic Industry (BPI) about implementing a system similar to Cleanfeed so as to block access to works allegedly infringing copyright.
    • BT says that if the pressure to extend the scope of Cleanfeed became too great it would simply cancel the project.
  • Mere conduit may be at risk
    • BT is satisfied that in diverting traffic to a filter that may block access depending on the results of a URL match it does not “select the recipient of the transmission” within the meaning of the E-Commerce Regulations. However, some outside BT believe that this cannot really be known either way until it is tested in court; if BT were wrong on this issue it could lose its “mere conduit” defence and would potentially face liability for all the traffic on its network (not just traffic that it blocked).
    • If BT faced an adverse finding on this issue Cleanfeed would be terminated.
      BT is unlikely to be the defendant of choice for a copyright holder or other party attempting to hold an ISP legally responsible for Internet traffic.

Posted by malcolm on Friday, September 10th, 2004 at 2:47 pm.
