A new combination of Bayesian methods for studying the role of DOM in water quality and greenhouse gases
Description
This method is not a conventional Bayesian network analysis. It integrates data-driven Bayesian network learning into an environmental analysis workflow to examine how dissolved organic matter (DOM) and water quality influence greenhouse gas emissions, and it separates direct effects from indirect ones. The resulting conditional probability table shows how the probability of a given emission level changes under specific conditions, which makes the results easier to read. The same approach can be applied to other problems in which several independent variables affect a single dependent variable. Because the network structure is learned through a combined analytical framework rather than specified by hand, the method is intended to be more practical than traditional Bayesian approaches. For further details, please refer to the paper and its methodological description.
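To illustrate what such a probability table looks like, the following minimal sketch estimates P(emission level | DOM level) by relative frequency from discretized records. The data, the variable names `DOM` and `CO2`, and the helper `conditional_probability_table` are hypothetical and are not taken from the dataset or the paper; the estimator used by the authors may differ.

```python
from collections import Counter


def conditional_probability_table(rows, child, parent):
    """Estimate P(child | parent) from discretized records by relative frequency."""
    joint = Counter((row[parent], row[child]) for row in rows)
    marginal = Counter(row[parent] for row in rows)
    return {
        (p_state, c_state): count / marginal[p_state]
        for (p_state, c_state), count in joint.items()
    }


# Hypothetical toy records: 20 samples with a DOM level and a CO2 emission level.
rows = (
    [{"DOM": "low", "CO2": "low"}] * 8
    + [{"DOM": "low", "CO2": "high"}] * 2
    + [{"DOM": "high", "CO2": "high"}] * 8
    + [{"DOM": "high", "CO2": "low"}] * 2
)

cpt = conditional_probability_table(rows, child="CO2", parent="DOM")
for (dom, co2), prob in sorted(cpt.items()):
    print(f"P(CO2={co2} | DOM={dom}) = {prob:.2f}")
```

With these toy records, 8 of the 10 high-DOM samples have high emissions, so the table gives P(CO2=high | DOM=high) = 0.80.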
Steps to reproduce
The method first applies variance inflation factor (VIF) analysis to screen out collinear variables. This reduces dimensionality, addresses multicollinearity, and ensures that only key factors are retained, so the network does not become so complex that it obscures the critical relationships. A random forest is then used to rank the retained factors by importance, replacing traditional approaches such as mutual information or manual ranking. The top 3–5 factors are treated as direct influences on greenhouse gas emissions, and the remaining factors are treated as indirect influences. The data are subsequently discretized (the specific method depends on the data characteristics), and candidate network structures are scored with the BDeu (Bayesian Dirichlet equivalent uniform) score. The Hill Climbing algorithm adds the edges that most increase the BDeu score, and bootstrap resampling is used to assess which edges are stable, which supports the analysis of indirect factors. Finally, a conditional probability table is generated. The network is learned entirely from the data, whereas traditional Bayesian networks often depend on subjective judgments about structure, so this approach makes fuller use of the information in the data. By combining the techniques above, the relationships among the factors can be determined more thoroughly, quickly, and accurately. For more details, see the paper.
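The scoring and edge-selection step can be sketched as follows. This is a minimal illustration of the standard BDeu local score and of the test a hill-climbing search applies when considering one new edge; it is not the authors' implementation. The function names, the equivalent sample size of 1.0, the toy records, and the choice to take each variable's cardinality from its observed states are all assumptions made for this example.

```python
from collections import Counter
from math import lgamma


def bdeu_local_score(rows, child, parents, ess=1.0):
    """BDeu local score of `child` given `parents` for discrete records.

    rows: list of dicts mapping variable name -> discrete state.
    ess:  equivalent sample size (assumed value, not from the source).
    """
    r = len({row[child] for row in rows})  # number of child states
    q = 1                                  # number of parent configurations
    for p in parents:
        q *= len({row[p] for row in rows})
    n_j = Counter(tuple(row[p] for p in parents) for row in rows)
    n_jk = Counter((tuple(row[p] for p in parents), row[child]) for row in rows)
    a_j, a_jk = ess / q, ess / (q * r)
    score = 0.0
    # Unobserved configurations contribute zero, so only observed counts are summed.
    for n in n_j.values():
        score += lgamma(a_j) - lgamma(a_j + n)
    for n in n_jk.values():
        score += lgamma(a_jk + n) - lgamma(a_jk)
    return score


def edge_gain(rows, child, parents, candidate, ess=1.0):
    """Score change from adding the edge candidate -> child (hill climbing keeps it if > 0)."""
    before = bdeu_local_score(rows, child, parents, ess)
    after = bdeu_local_score(rows, child, parents + [candidate], ess)
    return after - before


# Hypothetical discretized records in which CO2 level tracks DOM level.
records = (
    [{"DOM": "low", "CO2": "low"}] * 8
    + [{"DOM": "low", "CO2": "high"}] * 2
    + [{"DOM": "high", "CO2": "high"}] * 8
    + [{"DOM": "high", "CO2": "low"}] * 2
)

gain = edge_gain(records, child="CO2", parents=[], candidate="DOM")
print(f"BDeu gain from adding DOM -> CO2: {gain:.3f}")
```

A positive gain means the edge improves the score and would be accepted by a greedy search. A full hill-climbing search repeats this test over all candidate additions, deletions, and reversals while keeping the graph acyclic, and bootstrap stability is obtained by rerunning the search on resampled records and counting how often each edge reappears.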
Institutions
- Chinese Research Academy of Environmental Sciences, Beijing, Beijing