UK censors websites, blocks access

December 7th, 2008

Blocking access to websites because of their content is common in developing countries. However, we now see examples in the UK.

Apparently, the Internet Watch Foundation maintains a list of over 1000 URLs with content that it deems unsuitable. It distributes this list to British Internet service providers, which (apparently independently) implement techniques to block access to those web resources. For more, see this article or this AP article.

In some cases, the ISP shows an informative page stating that the web resource is blocked, while in other cases the page shows misleading content. If you are in the UK, the page that is blocked is this one (the Wikipedia page on a 1976 Scorpions music album), a page of the Wikipedia online encyclopaedia.

In one example, with the VirginMedia ISP, the blocking facility returns a fake error page:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /wiki/Virgin_Killer was not found on this server.</p>
<p>Additionally, a 404 Not Found
error was encountered while trying to use an ErrorDocument to handle the request.</p>
</body></html>

VirginMedia uses what is called a transparent proxy for those IP addresses that host the questionable URLs. This means that when you visit any page that belongs to the affected web server, all of your traffic to it goes through a special server that can keep logs and also modify content in transit. In this specific case of blocking, we know that the transparent proxy server at least fakes the reply of the web server.

Blocking access often has technical side-effects. The side-effect of the British blocking is that when you access any other page of the affected website, the web server sees the request as coming from the transparent proxy server rather than from your own computer. For VirginMedia, the transparent proxy server IP address is 62.30.249.131, so what Wikipedia sees is that the majority of UK visitors appear to be behind that single IP address.

Wikipedia sees the vast majority of UK visitors coming from a select few IP addresses.
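The effect on the web server's logs can be illustrated with a toy model. This is only a sketch under stated assumptions: the proxy address is the one quoted above, while the client and destination addresses are documentation-range placeholders, and the function names are made up for this example. It does not describe how VirginMedia's proxy is actually implemented.

```python
from collections import Counter

# Proxy address quoted in the post. All other addresses below are
# placeholders from the reserved documentation ranges.
PROXY_IP = "62.30.249.131"


def source_ip_seen_by_server(client_ip, dest_ip, proxied_dest_ips):
    """Return the address the destination web server would log.

    If the destination is routed through the transparent proxy, the
    server sees the proxy's address instead of the client's.
    """
    if dest_ip in proxied_dest_ips:
        return PROXY_IP
    return client_ip


def tally_sources(requests, proxied_dest_ips):
    """Count requests per logged source address.

    `requests` is an iterable of (client_ip, dest_ip) pairs.
    """
    return Counter(
        source_ip_seen_by_server(client, dest, proxied_dest_ips)
        for client, dest in requests
    )


# Three different customers visit a proxied site; one visits another site.
example_requests = [
    ("192.0.2.10", "203.0.113.5"),
    ("192.0.2.11", "203.0.113.5"),
    ("192.0.2.12", "203.0.113.5"),
    ("192.0.2.10", "198.51.100.7"),
]
example_tally = tally_sources(example_requests, {"203.0.113.5"})
print(example_tally.most_common())
```

In this model, three distinct visitors to the proxied site collapse into a single logged address, which is the pattern Wikipedia observed.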

It is possible to figure out which other websites are blocked. For example, we can search the Web for occurrences of the IP address 62.30.249.131.

After some investigation, we verified that the IP address 68.180.151.74 is also in the blocking list. When someone visits any web server that is hosted on that IP address, the web server registers every user from Britain as coming from a handful of IP addresses.

If the web server offers Internet advertising, then the ad provider will register a huge increase in clicks from a single IP address, which may accidentally trigger its fraud protection. It is possible to come up with a wide range of such scenarios. However, for the IP address 68.180.151.74, is it a big issue? Well, if we look carefully at 68.180.151.74, we notice that the address belongs to Yahoo, and it hosts over 320,000 web servers! Obviously, I have not been able to figure out which URL from 68.180.151.74 is the one that was blocked.

One can find more technical information about the blocked IP addresses. Using the tracert (traceroute) command, we notice a difference in the route that our packets take to reach the affected destinations.
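To make the advertising scenario concrete, here is a minimal sketch of the kind of check that could misfire. The heuristic, the function name `flag_concentrated_sources` and the 50% threshold are all assumptions made for illustration; real ad networks do not publish their fraud rules, and nothing here describes any specific provider.

```python
from collections import Counter


def flag_concentrated_sources(click_ips, max_share=0.5):
    """Return source addresses responsible for more than `max_share`
    of all clicks (a hypothetical fraud heuristic, not a real one).
    """
    if not click_ips:
        return []
    counts = Counter(click_ips)
    total = len(click_ips)
    return sorted(ip for ip, n in counts.items() if n / total > max_share)


# Eight clicks arrive via the proxy address, two from elsewhere
# (the last two addresses are documentation-range placeholders).
example_clicks = ["62.30.249.131"] * 8 + ["192.0.2.1", "192.0.2.2"]
print(flag_concentrated_sources(example_clicks))
```

Under this assumed rule, ordinary UK visitors funnelled through the proxy would be flagged as one suspicious source even though they are many different people.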

Here is a normal packet route when accessing www.google.com,

  Host
  ...........
 5. so-1-1-0.0.mcr-cor-001.bddsl.net
 6. popl-t3core-1a-ge-410-0.network.virginmedia.net
 7. man-bb-a-as0-0.network.virginmedia.net
    pop-bb-a-as2-0.network.virginmedia.net
 8. pop-bb-b-ae0-0.network.virginmedia.net
 9. tele-ic-2-as0-0.network.virginmedia.net
10. 212.250.14.138
11. 209.85.252.76
12. 216.239.43.123
13. 72.14.233.77
    72.14.233.79
14. 209.85.249.129
    209.85.249.133
    216.239.43.30
    216.239.43.34
15. nf-in-f99.google.com

At hop 11, the route moves onto Google's network. Up to hop 10, the route goes through VirginMedia routers.

How does this look for Wikipedia?

 Host
 ...........................................................
 5. so-1-1-0.0.mcr-cor-001.bddsl.net
 6. popl-t3core-1a-ge-410-0.network.virginmedia.net
 7. 213.105.175.1
    pop-bb-a-as2-0.network.virginmedia.net
 8. man-bb-b-ae0-0.network.virginmedia.net
    win-bb-b-so-010-0.network.virginmedia.net
 9. bir-bb-a-so-010-0.network.virginmedia.net
    bir-bb-a-so-100-0.network.virginmedia.net
10. florence-pos20.network.virginmedia.net            
11. cancun-pos50.network.virginmedia.net              
12. puebla-pos90.network.virginmedia.net             
13. rabat-pos90.network.virginmedia.net               
14. osr02know-tenge73.network.virginmedia.net         
15. wb7301a.network.virginmedia.net                   
16. osr-hsd-gw3-ge147.network.virginmedia.net        
17. osr-hsd-gw4-tenge82.network.virginmedia.net       
18. XSR03.Asd002A.surf.net
19. AE1.500.JNR01.Asd001A.surf.net
20. KNCSW001-router.Customer.surf.net
21. 4ge-1-16.csw1-knams.wikimedia.org
22. rr.knams.wikimedia.org

The route to the other blocked IP address (68.180.151.74) shows:

 Host                                               
.......
 5. so-1-1-0.0.mcr-cor-001.bddsl.net                
 6. popl-t3core-1a-ge-410-0.network.virginmedia.net 
 7. pop-bb-a-as2-0.network.virginmedia.net          
    man-bb-a-as0-0.network.virginmedia.net
 8. win-bb-b-so-010-0.network.virginmedia.net       
    man-bb-b-ae0-0.network.virginmedia.net
 9. bir-bb-a-so-100-0.network.virginmedia.net       
    bir-bb-a-so-010-0.network.virginmedia.net
10. florence-pos20.network.virginmedia.net          
11. cancun-pos50.network.virginmedia.net            
12. puebla-pos90.network.virginmedia.net            
13. rabat-pos90.network.virginmedia.net             
14. osr02know-tenge73.network.virginmedia.net       
15. wb7301a.network.virginmedia.net                 
16. osr-hsd-gw3-ge147.network.virginmedia.net       
17. gsr-hsd-gw1-ge10.network.virginmedia.net        
18. 213.228.222.10                                  
19. ae1-p141.msr1.sp1.yahoo.com                     
20. ge-1-43.bas-b1.sp1.yahoo.com                    
21. p2p.geo.vip.sp1.yahoo.com

For both of the blocked IP addresses, we see a set of additional VirginMedia routers in the route (hops 10 to 17 in the two listings above, starting with florence-pos20).

When trying any other IP address, those routers do not appear.
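The comparison between the three routes can also be done mechanically. The sketch below uses the VirginMedia hostnames from hops 7 to 17 of the listings above, with the `.network.virginmedia.net` suffix dropped for brevity and bare IP addresses left out; the variable and function names are made up for this example. It lists hops that appear on both blocked routes but not on the Google route.

```python
# VirginMedia hostnames from the three traceroute listings
# (suffix ".network.virginmedia.net" omitted, bare IPs left out).
GOOGLE_HOPS = [
    "man-bb-a-as0-0", "pop-bb-a-as2-0", "pop-bb-b-ae0-0", "tele-ic-2-as0-0",
]
WIKIPEDIA_HOPS = [
    "pop-bb-a-as2-0", "man-bb-b-ae0-0", "win-bb-b-so-010-0",
    "bir-bb-a-so-010-0", "bir-bb-a-so-100-0", "florence-pos20",
    "cancun-pos50", "puebla-pos90", "rabat-pos90", "osr02know-tenge73",
    "wb7301a", "osr-hsd-gw3-ge147", "osr-hsd-gw4-tenge82",
]
YAHOO_HOPS = [
    "pop-bb-a-as2-0", "man-bb-a-as0-0", "win-bb-b-so-010-0",
    "man-bb-b-ae0-0", "bir-bb-a-so-100-0", "bir-bb-a-so-010-0",
    "florence-pos20", "cancun-pos50", "puebla-pos90", "rabat-pos90",
    "osr02know-tenge73", "wb7301a", "osr-hsd-gw3-ge147", "gsr-hsd-gw1-ge10",
]


def hops_only_on_blocked_routes(normal, blocked_a, blocked_b):
    """Hops shared by both blocked routes and absent from the normal one,
    kept in the order they appear in `blocked_a`."""
    normal_set = set(normal)
    blocked_b_set = set(blocked_b)
    return [h for h in blocked_a if h in blocked_b_set and h not in normal_set]


extra_hops = hops_only_on_blocked_routes(GOOGLE_HOPS, WIKIPEDIA_HOPS, YAHOO_HOPS)
for hop in extra_hops:
    print(hop)
```

Note that the result also contains a few backbone routers (the man-, win- and bir- hops), which may simply reflect ordinary routing towards a different part of the network. With only one normal route to compare against, a set difference like this can suggest candidates but cannot prove which hops belong to the blocking setup.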

The problem with blocking access to websites is that such measures, apart from causing a series of technical problems, do not have the desired positive effect. The botched attempt by VirginMedia to show a misleading error page for those blocked addresses is an example of a knee-jerk reaction.

Any attempt to block access to webpages or websites should first get feedback from the community. The current incident against Wikipedia puts the IWF in a bad light, and gives limited confidence that they can fulfil their goals.

Update #1: Wikipedia page censored in the UK for ‘child pornography’ (The Guardian)

Update #2: Great Firewall of Britain (The Nock Blog)

Update #3: Interview on BBC Radio with representatives from the IWF and the Open Rights Group.

Update #4: Amazon US under threat as internet watchdog reconsiders Scorpions censorship (The Guardian)

Update #5: Cory Doctorow: How to make child-porn blocks safe for the internet