First, we need to understand how the GFW blocks our traffic
- IP Blackhole: Currently unsolvable, but it only affects certain services, such as Google, Twitter, and YouTube.
- DNS Pollution: The GFW returns a fake IP for a domain name. To work around it, use the hosts file to force a specific IP for the domain, or use encrypted or signed DNS (DoH, DNS signatures, etc.).
- HTTP Hijacking: Because the traffic is not encrypted, the GFW, which sits on the path as a natural man-in-the-middle, can tamper with it directly (e.g., redirecting to a 404 page or hijacking to an anti-fraud page). Using HTTPS avoids this, but you may then run into SNI blocking.
- SNI Blocking: Before the encrypted connection between the client and the server is established, the client sends a Client Hello message, which is in plaintext and generally carries the server_name. The GFW can therefore see which website you are trying to access and block domains that are not on its whitelist (e.g., discord.com). Since server_name is an extension and not mandatory, you can avoid SNI blocking by not sending it.
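To make the SNI point concrete, here is a minimal Python sketch of what an on-path observer can read from a plaintext Client Hello. The helper names build_client_hello and extract_sni are made up for this illustration, the Client Hello is generated locally with Python's ssl module rather than captured from the wire, and nothing here is a claim about how the GFW itself is implemented.

```python
import ssl
import struct


def build_client_hello(hostname):
    """Return the raw TLS record a client would send first (no network used)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False  # required when hostname is None
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the Client Hello is written, then a reply is awaited
    return outgoing.read()


def extract_sni(record):
    """Return the server_name found in a Client Hello record, or None."""
    if len(record) < 6 or record[0] != 0x16 or record[5] != 0x01:
        return None  # not a handshake record carrying a Client Hello
    pos = 5 + 4                 # record header + handshake type/length
    pos += 2 + 32               # legacy_version + random
    pos += 1 + record[pos]      # session_id
    (suites_len,) = struct.unpack_from("!H", record, pos)
    pos += 2 + suites_len       # cipher_suites
    pos += 1 + record[pos]      # compression_methods
    if pos + 2 > len(record):
        return None             # no extensions block at all
    (ext_total,) = struct.unpack_from("!H", record, pos)
    pos += 2
    end = min(pos + ext_total, len(record))
    while pos + 4 <= end:
        ext_type, ext_len = struct.unpack_from("!HH", record, pos)
        pos += 4
        if ext_type == 0:       # server_name extension
            # layout: list length (2), name type (1), name length (2), name
            name_type = record[pos + 2]
            (name_len,) = struct.unpack_from("!H", record, pos + 3)
            if name_type != 0:
                return None
            return record[pos + 5:pos + 5 + name_len].decode("ascii")
        pos += ext_len
    return None
```

Calling extract_sni(build_client_hello("discord.com")) returns the hostname, while a Client Hello built with None carries no server_name extension and the function returns None.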
Now, let's analyze how the GFW blocks different websites
We use Wireshark for packet capturing.
- First, try to access www.baidu.com, a domain that the GFW does not block.
  - Ping it first.
  - Get the IP: 2408:873d:22:18ac:0:ff:b021:1393
  - Force the domain to that IP through the hosts file.
  - Capturing packets with Wireshark shows that the Client Hello sent by the client clearly contains the Server Name field, and the client receives the Server Hello normally, after which both parties begin communicating.
  - Check the browser: the website loads normally.
- Next, try to access discord.com.
  - Ping it first: both the domain and the resolved IP are unreachable.
  - At this point, use itdog.cn to run an IPv4 ping, then ping each of the resolved IPs in turn.
  - The first IP is reachable.
  - Force the domain to that IP through the hosts file and capture packets.
  - After the hosts binding, when the client sends the Client Hello, the GFW detects the Server Name field and sends a RST packet to the client, which resets the connection. The browser then reports ERR_CONNECTION_RESET, and the user cannot access the page.
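The reset can also be observed without a browser. The sketch below uses a hypothetical helper named probe_tls: it connects to a fixed IP (the same effect as the hosts binding), attempts a TLS handshake with a chosen SNI, and reports how the attempt ended. Certificate verification is disabled because only the connection outcome matters here, and the IP you pass in is whichever address you found reachable; the sketch does not supply one.

```python
import socket
import ssl


def probe_tls(ip, sni, port=443, timeout=5.0):
    """Try a TLS handshake against a fixed IP and classify the outcome.

    Returns "ok", "reset", "timeout", or "error:<ExceptionName>".
    """
    ctx = ssl.create_default_context()
    # Only the connection behaviour is of interest, not the certificate.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((ip, port), timeout=timeout) as raw:
            # server_hostname becomes the Server Name in the Client Hello.
            with ctx.wrap_socket(raw, server_hostname=sni):
                return "ok"
    except ConnectionResetError:
        return "reset"    # a RST arrived, as in the capture described above
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return "error:" + type(exc).__name__
```

In the situation described above, a call such as probe_tls(reachable_ip, "discord.com") would be expected to return "reset", matching the ERR_CONNECTION_RESET seen in the browser.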
Next, try sending the Client Hello without a Server Name
The site loads successfully, and the Server Name field no longer appears in the Wireshark capture.
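On the client side, leaving out the extension can be as simple as not passing a hostname to the TLS library. Here is a minimal sketch using Python's ssl module; client_hello_bytes is a hypothetical helper that builds the Client Hello in memory so the difference can be checked without touching the network. Hostname checking has to be switched off for this to work, so a real client would need another way to validate the certificate, and servers that rely on SNI to pick a certificate may answer differently.

```python
import ssl


def client_hello_bytes(server_hostname):
    """Build a Client Hello in memory, with or without a Server Name."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Without a hostname there is nothing to check the certificate against.
    ctx.check_hostname = False
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # the Client Hello has been written; no server reply exists
    return outgoing.read()


with_sni = client_hello_bytes("discord.com")
without_sni = client_hello_bytes(None)

# The hostname travels in plaintext only when a Server Name is supplied.
print(b"discord.com" in with_sni)
print(b"discord.com" in without_sni)
```

With a real socket, the same choice is made by passing server_hostname=None to wrap_socket on a context whose check_hostname is False.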
The Killer Move, tcpioneer
It modifies TCP packets in such a way that the GFW cannot detect them. Wireshark no longer shows a Client Hello message either, yet the connection is still established, as confirmed by the Server Hello that the server sends back.