Disinfecting drinking water has played a central role in protecting societies from deadly waterborne illnesses for more than a century. By killing harmful microorganisms such as bacteria, viruses, and parasites, water treatment prevents the spread of diseases that once swept rapidly through communities. Even water that appears crystal clear can contain invisible pathogens capable of causing severe and sometimes fatal illness, particularly among children, older adults, and those with weakened immune systems. Before modern sanitation systems were introduced, outbreaks of cholera, typhoid, and dysentery were frequent and devastating, claiming countless lives and crippling entire cities. For this reason, the widespread disinfection of drinking water is considered one of the most important public health achievements in human history.
However, the chemical processes that make water safer also set off unintended reactions. Disinfectants such as chlorine and chloramine interact with naturally occurring organic matter found in rivers, lakes, and underground aquifers. This organic material consists of dissolved carbon-based compounds that are present in nearly all natural water sources. When disinfectants come into contact with this matter, they form substances known as disinfection byproducts, or DBPs, some of which have raised long-term health concerns among scientists and regulators.
Certain DBPs, including trihalomethanes and haloacetic acids, have been linked in studies to increased risks of bladder cancer and developmental problems during pregnancy. The Environmental Protection Agency has established safety limits for a small number of these compounds in public water supplies. Yet many others remain unregulated. According to Tao Ye, an assistant professor at Stevens Institute of Technology, only 11 byproducts are currently monitored under federal standards, even though researchers have already identified several hundred additional compounds that may form during water treatment. Some of these lesser-known chemicals could potentially be more harmful than those already regulated.
Testing such a vast number of substances using traditional laboratory methods is extremely difficult. Toxicity experiments require specialised equipment, long exposure periods, and significant financial resources, which limit how many chemicals can realistically be studied. To overcome this barrier, researchers are increasingly turning to artificial intelligence. Machine learning models can analyse large datasets and identify patterns between chemical structures and toxic effects, allowing scientists to predict the potential danger of compounds that have never been tested in the lab.
To advance this approach, Ye and his doctoral student, Rabbi Sikder, collaborated with Peng Gao from the Harvard T.H. Chan School of Public Health to develop an AI model focused specifically on disinfection byproducts. They compiled experimental toxicity data from previous scientific research covering more than 200 known chemicals, including their molecular structures and exposure conditions. This information was used to train the model to estimate toxicity levels for other byproducts whose health effects remain unknown.
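For readers curious how a small set of tested chemicals can inform predictions for a much larger untested set, the following is a minimal sketch of one generic semi-supervised idea known as self-training. It is not the researchers' model. The nearest-neighbour predictor, the single numeric "descriptor" per compound, the toxicity values, and the function names `knn_predict` and `self_train` are all assumptions made purely for illustration.

```python
from math import dist


def knn_predict(labelled, x, k=3):
    """Return (mean toxicity, mean distance) of the k labelled points nearest to x."""
    nearest = sorted(labelled, key=lambda item: dist(item[0], x))[:k]
    mean_tox = sum(tox for _, tox in nearest) / len(nearest)
    mean_dist = sum(dist(feat, x) for feat, _ in nearest) / len(nearest)
    return mean_tox, mean_dist


def self_train(labelled, unlabelled, k=3, max_dist=1.0):
    """Predict toxicity for unlabelled compounds, closest-to-known first.

    A prediction is reused as a pseudo-label only when the compound lies
    within max_dist of its neighbours, i.e. when the guess is more trustworthy.
    """
    labelled = list(labelled)      # copy so the caller's data is not modified
    remaining = list(unlabelled)
    predictions = {}
    while remaining:
        # Score every remaining compound against the current labelled set.
        scored = [(knn_predict(labelled, x, k), x) for x in remaining]
        (tox, d), best = min(scored, key=lambda s: s[0][1])
        predictions[best] = tox
        remaining.remove(best)
        if d <= max_dist:
            labelled.append((best, tox))
    return predictions


# Hypothetical data: (descriptor tuple, measured toxicity score).
tested = [((0.0,), 1.0), ((10.0,), 5.0)]
untested = [(1.0,), (9.0,)]
print(self_train(tested, untested, k=1))
```

In this toy setup each untested compound simply inherits the score of its closest known neighbour. A real model would rely on far richer structural descriptors and would be validated against held-out laboratory data.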
The AI system ultimately predicted the toxicity of more than 1,100 disinfection byproducts and identified several that may pose significantly higher risks than chemicals currently regulated. Some were estimated to be between two and ten times more toxic than existing EPA-monitored substances. The findings, published in Environmental Science & Technology Letters in January 2026, highlight a number of compounds that could become priorities for future research and regulation.
Despite these discoveries, the researchers emphasise that drinking water remains safe for the public. The compounds identified represent a broad range of substances that may form under different environmental conditions, not a mixture present in everyday tap water. Water sources and treatment methods vary widely across regions, meaning different byproducts appear in different places. The purpose of the research is to improve understanding, guide smarter regulation, and further enhance water safety. For individuals who wish to reduce exposure even further, household water filters and boiling can help remove some byproducts. As AI continues to refine these predictions, scientists hope it will become a powerful tool in ensuring clean, safe drinking water for generations to come.
More information: Rabbi Sikder et al., Multi-Endpoint Semisupervised Learning Identifies High-Priority Unregulated Disinfection Byproducts, Environmental Science & Technology Letters. DOI: 10.1021/acs.estlett.5c01145
Journal information: Environmental Science & Technology Letters
Provided by Stevens Institute of Technology
