The shorter version:
1. Bayesian graphical models are new to me.
2. I want to use R to model spatial variation in county-level crime using a BGN. I have been working with bnlearn, and would ideally like to continue using bnlearn.
3. My data is hierarchical, such that county-level variables are nested in state-level variables.
4. I would like to account for this hierarchical structure in my BGN.
5. I have not been able to find any worked examples of how to do this.
6. I am wondering if I could make each county-level variable the descendant of a node that indicates state membership. That is, every county-level variable X would depend on state membership.
7. Is this an acceptable approach? Are there better approaches? If there are better approaches, how can they be implemented in R?
The longer version:
I have a pretty strong background in traditional frequentist statistics and hierarchical modeling, and I have a weaker but extant familiarity with Bayesian methods for parameter estimation. However, due to what seem to be some differences in discourse, transferring this knowledge to BGNs has been less fluid than I anticipated.
Currently, I am working on a project that involves predicting spatial variation in crime at the county level. In order to better understand BGNs and to exploit their strengths, I would like to represent my model as a BGN. However, I am having trouble figuring out how to account for the hierarchical structure in my data.
I have read several papers on hierarchical Bayesian networks (this was a particularly clear one: http://www.cs.bris.ac.uk/Publications/Papers/1000650.pdf). However, I do not know how to build one in R, and I have been unable to find any examples or tutorials.
My request, here, is for any information that will help me represent the hierarchical structure of my data in a BGN using R. Further, while theory is always helpful, what I am really looking for is practical examples of how to do this.
A few side notes:
My suspicion is that I am just not thinking about this problem properly.
I faced a similar issue when trying to decide how to account for the spatial autocorrelation in my data. Doing so within a regression framework generally requires (statistically) non-trivial model adaptations. However, I eventually found a number of papers that model spatial autocorrelation in a BGN by adding, for every variable X that might be autocorrelated, a node representing the neighbor mean of X as a parent of the node for X.
I am honestly not sure how sufficient this relatively simple approach is, but I am wondering whether representing hierarchical structure, at least to get started, might not be accomplished similarly.
For example, I want to reflect the fact that counties are nested in states in my data. For each county-level node, could I then declare a parent node indicating which state that county is in? Could this be at least a starting place? Or are there principles or assumptions that this might risk violating?
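To make the structure described above concrete, here is a minimal sketch of how such a DAG could be declared in bnlearn. The node names (`state`, `crime`, `poverty`, `density`) are hypothetical placeholders, not variables from the actual data, and the bnlearn call is guarded so the snippet still runs if the package is not installed.

```r
# Hypothetical node names; substitute the columns of your own data set.
state_node  <- "state"
county_vars <- c("crime", "poverty", "density")

# Build a bnlearn model string in which the state node has no parents
# and every county-level variable has the state node as its parent.
model_string <- paste0(
  "[", state_node, "]",
  paste0("[", county_vars, "|", state_node, "]", collapse = "")
)
print(model_string)

# Turn the string into a DAG object if bnlearn is available.
if (requireNamespace("bnlearn", quietly = TRUE)) {
  dag <- bnlearn::model2network(model_string)
  print(bnlearn::arcs(dag))  # one arc from "state" to each county-level node
}
```

This only fixes the graph structure. As far as I understand, a discrete state parent gives each state its own parameters for the child node, which is not the same as the partial pooling of a multilevel regression, so it is best treated as a starting point.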
Thanks in advance for your thoughts.
I am also relatively new to Bayesian Belief Networks (BBNs) and have tried to answer this myself. Without having data to work with, I thought it was worthwhile to mention M Lappenschaar et al. as a useful reference. Although you may have already come across this article, it has a great overview of the need for multilevel considerations in BBNs, with good examples.
Based upon this paper, I believe you answered your own question: the structure of the DAG is what ensures the multilevel aspect is taken into account. From the paper: "the BN is constrained in the sense that no edges exist from a lower-level variable to a higher-level variable", which you can see in the images below. Based upon this information, I believe you can likely implement the BBN of your choosing using bnlearn (in fact, the authors of this paper used bnlearn). You just need to constrain the arcs in whatever way is specific to your application.
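As a rough sketch of that constraint, the arcs from lower-level to higher-level variables can be listed in a blacklist and passed to one of bnlearn's structure learning functions. The variable names and the simulated data below are invented purely for illustration, and `hc()` is just one possible choice of learning algorithm.

```r
# Hypothetical variable names; replace with your own state- and county-level columns.
state_vars  <- c("state_policy", "state_income")
county_vars <- c("county_crime", "county_poverty", "county_density")

# Forbid every arc that points from a county-level (lower) variable
# to a state-level (higher) variable.
blacklist <- expand.grid(from = county_vars, to = state_vars,
                         stringsAsFactors = FALSE)
print(blacklist)

if (requireNamespace("bnlearn", quietly = TRUE)) {
  # Simulated continuous data, for illustration only.
  set.seed(1)
  n <- 500
  state_policy   <- rnorm(n)
  state_income   <- rnorm(n)
  county_poverty <- 0.8 * state_income + rnorm(n)
  county_density <- rnorm(n)
  county_crime   <- 0.5 * county_poverty + 0.3 * state_policy + rnorm(n)
  dat <- data.frame(state_policy, state_income,
                    county_crime, county_poverty, county_density)

  # Hill-climbing structure search that respects the blacklist.
  dag <- bnlearn::hc(dat, blacklist = blacklist)
  learned <- bnlearn::arcs(dag)
  print(learned)

  # No learned arc should run from a county-level to a state-level variable.
  stopifnot(!any(learned[, "from"] %in% county_vars &
                 learned[, "to"] %in% state_vars))
}
```

The same blacklist can be combined with a whitelist if some arcs, such as those from a state membership node to county-level variables, should always be present.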