Abstract#
With the rapid development of internet technology, the speed and scope of information dissemination have significantly increased. Emerging media such as social media, blogs, and video platforms allow individuals and groups to easily share opinions and information. This free flow of information promotes public participation, social movements, and the democratization process. However, the openness of the internet also brings challenges in information control. Governments in many countries and regions have gradually strengthened the censorship of online content for the sake of national security, social stability, and political control. This censorship often manifests as blocking specific websites, monitoring social media content, and restricting speech.
This article explains the principles behind two common blocking techniques, DNS cache poisoning and SNI blocking, and offers ideas for detecting each. We also provide PoC (Proof of Concept) tools for checking whether your network and a target domain name are affected by these two blocking methods. Note that the methods have not been extensively tested: they may work in some countries or regions that use these censorship techniques but fail in others.
This article does not discuss methods to bypass blocking techniques, nor does it address other unmentioned blocking and censorship measures. The discussion of DNS cache poisoning is limited to the UDP-based DNS protocol; TCP-based DNS can be experimented with independently.
The detection methods in this article do not require the experimenter to deploy additional servers; they detect blocking purely from the principles on which the blocking systems operate.
Glossary#
Domain Name: An easy-to-remember string that identifies a specific location or resource on the internet. Domain names exist because IP addresses are inconvenient to remember. For example, www.fubao.dev is a domain name.
DNS (Domain Name System): Accessing a domain name ultimately still requires an IP address. DNS is akin to a phone book: just as you look up a phone number by name, you query an IP address by domain name. For example, the IP address corresponding to www.fubao.dev is 34.41.105.151.
SNI (Server Name Indication): Server Name Indication (SNI) is an extension of the TLS protocol. At the beginning of the handshake process, the client informs the connected server of the target host name. This allows the server to present multiple certificates on the same IP address and TCP port, thus supporting multiple secure (HTTPS) websites or other TLS-based services on the same IP address without requiring all sites to use the same certificate.
In simple terms, one IP address can host multiple domain names. When a site is accessed via HTTPS and the certificate must be verified, the client uses SNI to tell the server which domain's certificate to present so that the handshake can proceed smoothly. Even when the server only cares about the Host information at the HTTP layer, clients typically still send SNI during the handshake for compatibility.
SNI information is divided into two categories: one is plaintext SNI information, which is the most common and has good compatibility; the other is encrypted SNI (ESNI), but this article does not discuss ESNI.
DPI (Deep Packet Inspection): Deep Packet Inspection technology is a network packet filtering technology designed to analyze the content of packets passing through a detection point, including their data parts and possible headers. This technology is used to identify non-compliant protocols, viruses, spam, and intrusion behaviors. Based on preset standards, DPI can determine whether a packet is allowed to pass through or routed to another destination. Additionally, DPI can be used to collect statistical data on network traffic for monitoring and targeted advertising. For example, for some DPI systems, SNI information and DNS query target information are the data that need to be collected and monitored.
To actually block traffic, a DPI system typically works together with services that carry out the blocking action. Large DPI systems and their execution units generally use traffic mirroring and a bypass (out-of-path) deployment so that they do not affect normal network traffic, and they block communication by injecting specific packets. These execution units are commonly referred to as packet injectors.
DNS Pollution: After DPI detects a query for a blocked domain name, it uses a packet injector to send a forged DNS response. Because the DPI device is closer to the querying party than the real DNS server, the injected packet arrives first; some more advanced DPI and action systems may also discard the correct response that arrives later.
SNI Blocking: When DPI detects that the SNI in an HTTPS request contains a blocked domain name, it injects TCP RST packets to sever the connection between the requester and the target. For some specialized DPI systems, the RST injection is mostly bidirectional: the client is told that the server has closed the connection, and the server is told that the client has closed it. This ensures that simple bypass methods, such as discarding RST packets on one side, do not work.
Censorship Residue: In some DPI systems, blocking logic at the application layer can interact with blocking logic at other layers. For example, after you access a target website using a specific IP and SNI, the handshake packet may be discarded the next time you connect to that IP. Systems that discard packets are typically deployed in-line rather than as bypass deployments. Censorship residue makes it easier to study how some DPI systems work, but it also causes some inconvenience; this article mentions it but does not discuss it extensively.
Fishing: Common Weaknesses of DPI and Packet Injectors#
For DPI and packet injection systems in large-scale data scenarios, throughput and processing capability are important metrics. To balance cost and effectiveness, DPI systems and packet injectors typically have the following shortcomings:
- The implementation of the target protocol stack is usually lightweight: for cost reasons it is a minimal usable subset of the complete protocol stack and typically does not fully comply with the RFCs. For example, DPI systems that process TCP typically do not verify whether a TCP packet's checksum is valid.
- DPI and packet injectors are bypass deployments and can only see packets that pass through them, so the information they collect is incomplete. For example, a DPI system cannot know how many hops remain to the target, which makes it easy to deceive. If we send a packet whose TTL is just large enough to pass the DPI and packet injector but too small to reach the target, the DPI and packet injector will still react. Many DPI and packet injectors do assume a hop count to the target, but that assumption cannot cover packets with carefully crafted TTL values.
- DPI may reassemble the data streams of protocols carried over TCP and other stream transports, but there is always a limit on how long it will wait.
- etc.
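The TTL weakness in the list above can be sketched with an ordinary UDP socket. This is a minimal illustration and not one of the article's PoC tools: the function names are hypothetical, and you would have to find a suitable TTL yourself (large enough to pass the DPI, too small to reach the real server), for example with traceroute. If anything answers a packet that could not have reached its destination, the answer was most likely injected along the path.

```python
import socket


def make_ttl_limited_udp_socket(ttl: int, timeout: float = 2.0) -> socket.socket:
    """Return a UDP socket whose outgoing packets carry the given IP TTL."""
    if not 1 <= ttl <= 255:
        raise ValueError("ttl must be between 1 and 255")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Every datagram sent on this socket will expire after `ttl` hops.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    sock.settimeout(timeout)
    return sock


def answered_despite_low_ttl(payload: bytes, server: str, port: int, ttl: int) -> bool:
    """Send `payload` with a small TTL and report whether any datagram comes back.

    Hypothetical helper: assumes `ttl` is too small for the packet to reach
    `server`, so a reply suggests an on-path injector answered instead.
    """
    with make_ttl_limited_udp_socket(ttl) as sock:
        sock.sendto(payload, (server, port))
        try:
            sock.recvfrom(4096)
            return True
        except (socket.timeout, ConnectionError):
            # No reply within the timeout, or an ICMP error surfaced.
            return False
```

A call such as `answered_despite_low_ttl(query_bytes, resolver_ip, 53, ttl)` would send a DNS query that expires in transit; `query_bytes`, `resolver_ip`, and `ttl` are placeholders you supply.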
Identification of DNS Pollution#
The usual method for identifying DNS pollution is to compare the answer with known-correct results, but this method does not generalize. Because of EDNS Client Subnet (ECS) and CDN technologies, a DNS server may return the IP of the server closest to the requester to give the best access experience. The target service may also use different vendors in different regions and countries, so this comparison can produce a large number of false positives.
We can instead exploit the common weaknesses of DPI systems. Because DPI systems do not care whether a request is legitimate, we can construct an illegal DNS query, one that tells the DNS server that the query itself already contains the query result.
A normal DNS query is shown in the figure below:
We construct an illegal query, modifying the circled part in the figure:
If the illegal query is not answered (i.e., not polluted), we see that the result is a DNS server error:
If the illegal query is answered (i.e., DNS is polluted), we see a normal response result:
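The idea above can be sketched in a few lines of Python. This is not the author's dns_pollution.py. The figures are not reproduced here, so the exact field that was modified is unknown; as an assumption, this sketch sets ANCOUNT to 1 in a query that carries no answer record, which is one way a query can claim to "contain the query result". All function names are hypothetical.

```python
import struct


def encode_qname(domain: str) -> bytes:
    """Encode a domain name as length-prefixed DNS labels."""
    out = b""
    for label in domain.strip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(domain: str, txid: int = 0x1234, ancount: int = 0) -> bytes:
    """Build an A-record query; ancount=1 yields the malformed variant (assumption)."""
    flags = 0x0100  # standard query with recursion desired
    header = struct.pack("!HHHHHH", txid, flags, 1, ancount, 0, 0)
    question = encode_qname(domain) + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def parse_header(packet: bytes) -> dict:
    """Extract the header fields needed to judge a response."""
    txid, flags, qd, an, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    return {"id": txid, "qr": flags >> 15, "rcode": flags & 0xF,
            "qdcount": qd, "ancount": an}


def classify(response: bytes) -> str:
    """Label a response to the malformed query."""
    h = parse_header(response)
    if h["rcode"] == 0 and h["ancount"] > 0:
        # A normal-looking answer to an illegal query suggests injection.
        return "answered (possible pollution)"
    return "rejected (rcode=%d)" % h["rcode"]
```

To use it, send `build_query(domain, ancount=1)` over UDP to port 53 of a resolver and pass the reply to `classify`. A resolver that drops the malformed query silently will produce a timeout instead of an error, which this sketch does not handle.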
PoC: dns_pollution.py
Usage example:
Detection of SNI Blocking#
To detect SNI blocking, we first need to rule out the influence of DNS pollution. Since DPI systems typically work from limited information, we pick some external IP addresses that are known to be reachable, send requests carrying the SNI to them, and check whether we receive RST packets. There is no need to complete the full HTTPS handshake.
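As a rough illustration of this approach (not the author's sni_block.py), the sketch below hand-builds a minimal TLS ClientHello whose only extension is SNI, sends it to a chosen IP, and reports how the connection ends. The function names, cipher suite choice, and result labels are assumptions made for the example. A real server may answer such a minimal hello with a TLS alert; that still counts as a response rather than a reset.

```python
import os
import socket
import struct


def build_sni_extension(hostname: str) -> bytes:
    """Encode the server_name extension (type 0) for one host name."""
    name = hostname.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name  # name_type=host_name
    name_list = struct.pack("!H", len(entry)) + entry
    return struct.pack("!HH", 0x0000, len(name_list)) + name_list


def build_client_hello(hostname: str) -> bytes:
    """Build a minimal TLS 1.2 ClientHello record carrying only an SNI extension."""
    cipher_suites = struct.pack("!HH", 0xC02F, 0x009C)
    body = (
        b"\x03\x03"            # client_version: TLS 1.2
        + os.urandom(32)       # random
        + b"\x00"              # empty session_id
        + struct.pack("!H", len(cipher_suites)) + cipher_suites
        + b"\x01\x00"          # one compression method: null
    )
    extensions = build_sni_extension(hostname)
    body += struct.pack("!H", len(extensions)) + extensions
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body  # type=ClientHello
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


def probe_sni(ip: str, hostname: str, port: int = 443, timeout: float = 5.0) -> str:
    """Send the ClientHello to `ip` and describe what came back."""
    try:
        with socket.create_connection((ip, port), timeout=timeout) as sock:
            sock.sendall(build_client_hello(hostname))
            data = sock.recv(4096)
    except ConnectionResetError:
        return "reset"      # an RST arrived: consistent with SNI blocking
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return "error: %s" % exc
    return "response" if data else "closed"
```

Comparing `probe_sni(ip, blocked_name)` with `probe_sni(ip, harmless_name)` against the same IP helps separate SNI-triggered resets from ordinary connectivity problems. Because of censorship residue, repeated probes to the same IP may influence each other.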
PoC: sni_block.py
Usage example:
Conclusion#
Based on the above ideas, there are many variations of detection methods. Enjoy experimenting!