
# Fault Tolerance

<figure><img src="/files/Sfs3DevpONd6H4JaB2Dr" alt=""><figcaption><p>2D Reed-Solomon(RS) in Data Availability Sampling.</p></figcaption></figure>

### **AI Automated Polling Detection**

With machine learning added, polling detection no longer relies on scheduled checks alone. It also incorporates anomaly detection models, which let the system predict and diagnose potential faults from historical data.
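
As a minimal sketch of how a polled metric might be checked against historical data, the example below uses a simple z-score test. This is a stand-in for whatever anomaly detection model the system actually uses; the function name, the threshold of 3 standard deviations, and the sample values are all assumptions for illustration.

```python
from statistics import mean, stdev

def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it lies more than `threshold` standard
    deviations away from the mean of the historical readings."""
    if len(history) < 2:
        return False  # not enough history to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu  # any deviation from a constant history is unusual
    return abs(latest - mu) / sigma > threshold

# Hypothetical response times (ms) collected by earlier polling rounds.
history = [102, 98, 101, 99, 100, 103, 97]

print(is_anomalous(history, 101))  # a reading close to the historical mean
print(is_anomalous(history, 250))  # a reading far outside the usual range
```

A production model would likely account for trends and seasonality, but the polling loop can call it the same way: collect a reading, compare it with history, and raise an alert when the check returns `True`.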

For smaller-scale systems, polling detection can run every 5 minutes. For large-scale, high-concurrency systems, checks every 30 seconds to 1 minute are recommended.

The detection model's initial accuracy can reach 90%-95%, and as the model continues to iterate, its accuracy can gradually rise to 99%.

The Mean Time to Repair (MTTR) should be kept within 5 minutes, with most automatic recovery operations completed within 1 minute.
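
The MTTR target can be checked against incident records. The sketch below assumes each incident is stored as a `(detected, recovered)` timestamp pair; the record layout, names, and sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta

MTTR_TARGET = timedelta(minutes=5)  # target stated in the text above

def mttr(incidents):
    """Mean time to repair: the average of (recovered - detected)."""
    durations = [recovered - detected for detected, recovered in incidents]
    return sum(durations, timedelta()) / len(durations)

# Hypothetical incident log: (detected, recovered) pairs.
incidents = [
    (datetime(2024, 1, 1, 12, 0, 0), datetime(2024, 1, 1, 12, 0, 30)),
    (datetime(2024, 1, 1, 13, 0, 0), datetime(2024, 1, 1, 13, 1, 0)),
    (datetime(2024, 1, 1, 14, 0, 0), datetime(2024, 1, 1, 14, 4, 0)),
]

result = mttr(incidents)
print(result, "within target" if result <= MTTR_TARGET else "over target")
```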

### **Distributed Validators**

Validator task allocation typically relies on a consensus mechanism such as PBFT, Raft, or BFT-SMaRt, so that all validators take part in validating every transaction and data update in each round. This keeps the data valid and consistent. Note that Raft tolerates only crash faults, whereas PBFT and BFT-SMaRt also tolerate Byzantine (malicious) validators.

To ensure system reliability and data consistency, there should be at least five distributed validators, which provides sufficient redundancy and prevents single points of failure. In high-security scenarios, the number of validators can be increased to 10-15, so that majority validation makes malicious attacks harder to carry out.
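
To relate these validator counts to the number of faults they can absorb, the sketch below applies the standard bounds: a Byzantine fault-tolerant protocol such as PBFT needs n ≥ 3f + 1 validators to tolerate f faulty ones, and a crash-fault-tolerant protocol such as Raft needs n ≥ 2f + 1. The function names are illustrative and not part of any system API.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1 (PBFT-style protocols)."""
    return (n - 1) // 3

def max_crash_faults(n: int) -> int:
    """Largest f such that n >= 2f + 1 (Raft-style majority quorums)."""
    return (n - 1) // 2

for n in (5, 10, 15):
    print(f"{n} validators: "
          f"{max_byzantine_faults(n)} Byzantine or "
          f"{max_crash_faults(n)} crash faults tolerated")
```

Under these bounds, five validators tolerate one Byzantine fault or two crash faults, and fifteen validators tolerate four Byzantine faults or seven crash faults.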

The system uses prioritized validator allocation, assigning tasks to healthy, lightly loaded validators to improve accuracy and efficiency. Overall validation accuracy should reach 99.9%.
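
One way to picture prioritized allocation is to filter out unhealthy validators and then prefer those with the lowest load. The `Validator` fields and the `pick_validators` helper below are hypothetical; the source does not specify how health or load is measured.

```python
from dataclasses import dataclass

@dataclass
class Validator:
    name: str
    healthy: bool
    load: float  # assumed scale: 0.0 (idle) to 1.0 (saturated)

def pick_validators(validators, count):
    """Return up to `count` healthy validators, least loaded first."""
    candidates = [v for v in validators if v.healthy]
    candidates.sort(key=lambda v: v.load)
    return candidates[:count]

pool = [
    Validator("v1", healthy=True, load=0.80),
    Validator("v2", healthy=False, load=0.10),  # skipped: unhealthy
    Validator("v3", healthy=True, load=0.20),
    Validator("v4", healthy=True, load=0.50),
]

print([v.name for v in pick_validators(pool, 2)])
```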

### **Data Sampling Validation and Snapshots**

Data sampling validation is performed using Data Availability Sampling (DAS), in which light nodes and full nodes communicate to verify data consistency. The check works on sampled data points: cryptographic hashes of the sampled data are compared to confirm that they match. In high-reliability systems, the accuracy of each sampling validation should stay above 99.9%.
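
The hash comparison step can be sketched as follows. This is a simplified illustration, not the protocol itself: it assumes the light node already holds a trusted list of expected SHA-256 hashes, whereas a real DAS design would typically verify samples against a commitment such as a Merkle root. All names are hypothetical.

```python
import hashlib
import random

def chunk_hash(chunk: bytes) -> str:
    """SHA-256 digest of one data chunk, hex-encoded."""
    return hashlib.sha256(chunk).hexdigest()

def sample_and_verify(chunks, expected_hashes, sample_rate, rng=random):
    """Sample a fraction of the chunks at random and return
    (sampled indices, indices whose hash does not match)."""
    count = max(1, round(len(chunks) * sample_rate))
    sampled = sorted(rng.sample(range(len(chunks)), count))
    mismatched = [i for i in sampled
                  if chunk_hash(chunks[i]) != expected_hashes[i]]
    return sampled, mismatched

# Hypothetical data held by a full node, and the hashes a light node expects.
chunks = [f"chunk-{i}".encode() for i in range(100)]
expected = [chunk_hash(c) for c in chunks]

sampled, mismatched = sample_and_verify(chunks, expected, sample_rate=0.10)
print(len(sampled), "chunks sampled,", len(mismatched), "mismatches")
```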

The system samples and validates between 5% and 10% of the data so that the sample is representative of the whole. For smaller systems with fewer data points, the sampling rate can be raised to 20%-30% to ensure adequate coverage.
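
To see why smaller data sets call for a higher sampling rate, it helps to estimate how likely a sample is to miss a problem entirely. The sketch below assumes chunks are sampled uniformly at random without replacement and that a known number of them are bad. It does not model the Reed-Solomon encoding, and the parameter values are illustrative only.

```python
from math import comb

def miss_probability(total: int, bad: int, sampled: int) -> float:
    """Probability that `sampled` chunks drawn without replacement
    from `total` chunks include none of the `bad` ones."""
    return comb(total - bad, sampled) / comb(total, sampled)

# Same 10% sampling rate and same 1% share of bad chunks, different sizes.
large = miss_probability(total=1000, bad=10, sampled=100)
small = miss_probability(total=100, bad=1, sampled=10)
print(f"large system miss probability: {large:.3f}")
print(f"small system miss probability: {small:.3f}")
```

With the same sampling rate and the same share of bad chunks, the smaller system is far more likely to miss the fault, because fewer bad chunks exist in absolute terms. Raising the sampling rate lowers that probability.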

The system periodically creates snapshots to ensure data recoverability. A snapshot is typically taken every 10 to 30 blocks; the exact interval should depend on how quickly the system state changes and how important the data is.
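
A block-height-based schedule could look like the sketch below. The default interval of 20 blocks is an arbitrary value inside the 10 to 30 range, and both helper names are hypothetical.

```python
def should_snapshot(block_height: int, interval: int = 20) -> bool:
    """True when a snapshot is due at this block height."""
    return block_height > 0 and block_height % interval == 0

def latest_snapshot_height(block_height: int, interval: int = 20) -> int:
    """Height of the most recent snapshot at or below `block_height`."""
    return (block_height // interval) * interval

height = 57
restore_from = latest_snapshot_height(height)
print(f"restore from block {restore_from}, replay {height - restore_from} blocks")
```

A shorter interval reduces the number of blocks that must be replayed after a failure, at the cost of more frequent snapshot writes.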


