### The model

Differences in nutrition outcomes between girls and boys can be attributed to differences in the nutrient inputs of the health production function, which in turn can stem from familial child gender preferences. Unlike the unitary model of Becker (1965), the collective model used here for nutrient allocation analysis neither assumes homogeneous parental preferences nor folds individual preferences into a single household utility function; it assumes a stable decision process that yields a Pareto-efficient allocation within the household. Each Pareto frontier is associated with various decision procedures, which correspond to different sets of individual weights (Chiappori, 1997). A few studies using this approach confirm that there is a bias with respect to girls’ nutrition, in the sense that Pareto weights differ between boys and girls (Dercon and Pramela, 2000; Quisumbing, 2003; Duflo, 2000; Haddad and Hoddinott, 1994; Thomas, 1990).

To set out the theory of resource allocation among children, let *H*_{it} be the nutrition outcome of child *i* at time *t*. The child health production function depends on a set of inputs, denoted *I*, which includes nutrient consumption and the mother’s and father’s time for childcare; these in turn depend on observable and unobservable characteristics of the child (such as age, intimacy, and gender) and on other household and community level variables. We present the household utility maximization problem as a function of child nutrition as follows:

$$ \max U\left({\mathrm{H}}_{it}\left({c}_{it}\right),{\chi}_{it}\right) $$

(1)

where *c*_{it} represents child *i*’s consumption of goods and of home-produced child health inputs and nutrients, and *χ*_{it} represents parents’ consumption and household characteristics, such as parents’ education level, together with community level covariates such as access to a road. The health outcome of a child, *H*_{it}, depends on the child’s nutrient intake and other inputs, which in turn are influenced by the parents’ child gender preference and by aggregate household level environmental risk. Here, we assume that every change in household wealth has an equal effect on the nutrition of all household members. We can then write the child’s nutrient consumption as a function of child gender preference and aggregate household environmental risk:

$$ {c}_{it}\left({\phi}_i,{\theta}_t\right) $$

(2)

where *ϕ*_{i} and *θ*_{t} represent the parents’ time-invariant child gender preference and an aggregate household level environmental risk index, respectively. Both *ϕ*_{i} and *θ*_{t} are household level effects on child nutrition outcomes, which are measured at the individual level. The factors driving nutrient input decisions are therefore likely to differ by child gender. In addition to altruistic behavior among household members, we assume that parents are the only allocation decision-makers, so only the parents’ preferences enter the household utility function. Writing (1) as the weighted sum of the parents’ utilities gives:

$$ \operatorname{Max}\ {U}_t={\omega}_i{U}_{ft}\left({C}_{ft},{\mathrm{H}}_{it}\right)+\left(1-{\omega}_i\right){U}_{mt}\left({C}_{mt},{\mathrm{H}}_{it}\right) $$

(3)

Under the cooperative optimization framework, the parents (mother and father) agree on a welfare weight for each individual in the household. We replace the subscripts *f* and *m*, which denote the father’s and mother’s consumption, with *i* so as to include the consumption of every household member. Equation (3) can therefore be restated as

$$ \underset{\left\{{c}_{it}\right\}}{\max }{\sum}_{i=1}^I{\omega}_i{U}_t\left({C}_{it}\left({\phi}_i,{\theta}_t\right),{\mathrm{H}}_{it}\right) $$

(4)

subject to the household budget constraint and to 0 ≤ *t* ≤ *T*, the mother’s and father’s available time devoted to childcare.

*C*_{t}(*ϕ*_{i}, *θ*_{t}) is aggregate consumption given the aggregate household level shock index, *θ*_{t}, and child gender preferences, *ϕ*_{i}. *C*_{it} is individual consumption, which in the case of children is nutrient consumption.^{Footnote 3} Aggregate consumption, *C*_{t}(*ϕ*_{i}, *θ*_{t}), is the sum of all individual consumption and is less than or equal to household disposable income^{Footnote 4} *Y*_{t}.

The non-negative *ω*_{i} is the Pareto weight, assumed to be constant over time, that the social planner allots to each household member, so that resources are allocated according to the weights given to boys and girls (Browning and Chiappori 1998). Our assumptions *U*^{′}(*H*_{it}) > 0 and *U*^{′′}(*H*_{it}) < 0 imply that the utility function is increasing and concave.

We apply the Lagrange multiplier technique to Equation (4) with respect to *C*_{it}(*ϕ*_{i}, *θ*_{t}), using the fact that pooled household income is greater than or equal to the sum of household consumption, as in Equation (5):

$$ {\sum}_{i=1}^I{y}_{it}\left({\theta}_t\right)\ge {\sum}_{i=1}^I{C}_{it}\left({\theta}_t\right) $$

(6)

The resulting first-order condition is

$$ {\omega}_i{U}_i^{\prime}\left({C}_{it}\left({\phi}_i,{\theta}_t\right),{\mathrm{H}}_{it}\right)=\lambda \left({\theta}_t\right) $$

(7)

which, taking the ratio of condition (7) for two household members *i* and *j*, gives

$$ \frac{U_i^{\prime}\left({C}_{it}\left({\phi}_i,{\theta}_t\right),{\mathrm{H}}_{it}\right)}{U_j^{\prime}\left({C}_{jt}\left({\phi}_j,{\theta}_t\right),{\mathrm{H}}_{jt}\right)}=\frac{\omega_j}{\omega_i} $$

(8)

Equation (8) characterizes the parents’ optimal utility from the welfare outcomes of two individuals in the household; it holds if there is no bias in resource allocation.
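The steps leading from (4) to (8) can be sketched as follows. This is a minimal sketch, assuming that the household budget constraint enters with multiplier *λ*(*θ*_{t}) and that *U*_{i}^{′} denotes the marginal utility of member *i*’s consumption, as in (7); the Lagrangian symbol $\mathcal{L}$ is introduced here only for illustration.

$$ \mathcal{L}={\sum}_{i=1}^I{\omega}_i{U}_i\left({C}_{it}\left({\phi}_i,{\theta}_t\right),{\mathrm{H}}_{it}\right)+\lambda \left({\theta}_t\right)\left[{\sum}_{i=1}^I{y}_{it}\left({\theta}_t\right)-{\sum}_{i=1}^I{C}_{it}\left({\phi}_i,{\theta}_t\right)\right] $$

Setting the derivative with respect to each member’s consumption to zero gives, for any two members *i* and *j*,

$$ {\omega}_i{U}_i^{\prime}\left({C}_{it}\left({\phi}_i,{\theta}_t\right),{\mathrm{H}}_{it}\right)=\lambda \left({\theta}_t\right)={\omega}_j{U}_j^{\prime}\left({C}_{jt}\left({\phi}_j,{\theta}_t\right),{\mathrm{H}}_{jt}\right) $$

Because both sides equal the same multiplier, dividing through yields the ratio in (8): the marginal utilities of any two members are inversely proportional to their Pareto weights. Under concave utility, a member with a larger weight is therefore allocated consumption up to a lower marginal utility.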

Of interest here is the nutrition of children below working age. As noted earlier, the nutrition achievements of children in the same household, measured by *z*-scores, can vary because unmeasured parental characteristics, such as child gender preference, influence resource allocation.
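As a numerical illustration of condition (8), the following Python sketch allocates a fixed household budget between two children. It assumes logarithmic utility of consumption, *u*(*c*) = ln *c*, drops the health term for simplicity, and uses hypothetical Pareto weights and income; none of these values come from the data, and the function names are illustrative.

```python
def allocate(weights, income):
    """Split income so that weight_i * u'(c_i) is equal across members.

    Assumes u(c) = ln(c), so u'(c) = 1 / c and the optimum is
    c_i = weight_i * income / sum(weights).
    """
    total_weight = sum(weights)
    return [w * income / total_weight for w in weights]


def marginal_utility(consumption):
    """Marginal utility under the assumed log utility."""
    return 1.0 / consumption


# Hypothetical Pareto weights for child i and child j, and household income.
weights = [0.6, 0.4]
income = 100.0

c_i, c_j = allocate(weights, income)

# Left-hand side of (8): ratio of marginal utilities.
mu_ratio = marginal_utility(c_i) / marginal_utility(c_j)
# Right-hand side of (8): inverse ratio of Pareto weights.
weight_ratio = weights[1] / weights[0]

print(f"c_i = {c_i:.1f}, c_j = {c_j:.1f}")
print(f"U_i'/U_j' = {mu_ratio:.4f}, w_j/w_i = {weight_ratio:.4f}")
```

With weights of 0.6 and 0.4, the first child receives 60 percent of the budget, and the ratio of marginal utilities equals *ω*_{j}/*ω*_{i} = 2/3, as (8) requires. Equal weights would give equal consumption under this assumed utility function.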

### Estimation strategy

The LSMS dataset for Ethiopia has a hierarchical structure: individual child information is nested within the parents (the household), and the household is in turn nested within the environmental shocks that affected it. Our variable of interest, the child nutrition outcome, is at the individual level (level 1); parents are at level 2; and environmental shocks to the household^{Footnote 5} are at level 3. In other words, child nutrition achievements are influenced hierarchically by the child’s unobserved heterogeneity at the individual level, by unobserved heterogeneity in parents’ preferences over resource allocation at the household level, and by household level environmental shocks. Our hierarchical model for panel data is known as a repeated measures or growth-curve model (Gelman and Hill, 2007; Balov, 2016; StataCorp L. P, 2013). Dropping the panel time subscript for convenience, we present a simple repeated measures (growth-curve) model that allows both the intercept and the slope coefficient to vary (Raudenbush and Bryk, 2002):

$$ {H}_{ips}={\beta}_{0 ps}+{\beta}_{1 ps}{g}_{ips}+{\varepsilon}_{ips} $$

(9)

where *i*, *p*, and *s* index the individual, the parents at level 2, and the shock variables at level 3, respectively. *H*_{ips}, *g*_{ips}, and *ε*_{ips} denote the nutrition outcome of individual *i*, the individual covariates at level 1, and the idiosyncratic error, respectively. *β*_{1ps} is the slope coefficient on the level 1 covariate *g*_{ips}. We assume that the constant term, *β*_{0ps}, varies randomly across units as a function of level 2 factors, *x*_{p}, and level 3 factors, *k*_{s}; these include household level variables and shock events.

The model in (9) accounts for any possible heterogeneity associated with *p* and *s*. Equations (10) to (13) explain how the constant term varies randomly across units because of higher-level factors. The intercept and slope of the level 1 model in (9) vary between children depending on level 2 factors, as follows:

$$ {\beta}_{0 ps}={\alpha}_{00s}+{\alpha}_{01}{x}_{1 ps}+{u}_{0 ps} $$

(10)

$$ {\beta}_{1 ps}={\alpha}_{10s} $$

(11)

In the same way, the intercepts of the level 2 models for *β*_{0ps} and *β*_{1ps} vary between households according to the following level 3 models:

$$ {\alpha}_{00s}={\gamma}_{000}+{\gamma}_{001}{k}_{1s}+\dots +{\gamma}_{005}{k}_{5s}+{\upsilon}_{0s} $$

(12)

$$ {\alpha}_{10s}={\gamma}_{100}+{\gamma}_{101}{k}_{1s}+\dots +{\gamma}_{105}{k}_{5s} $$

(13)

Substituting (12) and (13) into (10) and (11), respectively, and then substituting (10) and (11) into (9) gives

$$ {H}_{ips}=\left({\gamma}_{000}+{\gamma}_{001}{k}_{1s}+\dots +{\gamma}_{005}{k}_{5s}+{\upsilon}_{0s}\right)+\left({\alpha}_{01}{x}_{1 ps}+{u}_{0 ps}\right)+\left({\gamma}_{100}+{\gamma}_{101}{k}_{1s}+\dots +{\gamma}_{105}{k}_{5s}\right){g}_{ips}+{\varepsilon}_{ips} $$

(14)

Rearranging the reduced-form model in (14), we get

$$ {H}_{ips}={\gamma}_{000}+{\gamma}_{001}{k}_{1s}+\dots +{\gamma}_{005}{k}_{5s}+{\gamma}_{100}{g}_{ips}+{\gamma}_{101}{k}_{1s}{g}_{ips}+\dots +{\gamma}_{105}{k}_{5s}{g}_{ips}+{\alpha}_{01}{x}_{1 ps}+{\upsilon}_{0s}+{u}_{0 ps}+{\varepsilon}_{ips} $$

(15)

where *υ*_{0s} is the random intercept at level 3 and *u*_{0ps} is the random intercept at level 2; together with the level 1 idiosyncratic error, *ε*_{ips}, they form the random-effect part of (15), while the remaining terms form the fixed-effect part. *γ*_{000} is the constant intercept, and *γ*_{001}, *γ*_{002}, … , *γ*_{005} are slope coefficients that capture the association between the child nutrition outcome, *H*_{ips}, and the level 3 shocks, *k*_{s}. Similarly, *γ*_{101}, *γ*_{102}, … , *γ*_{105} are slope coefficients on the interactions between the level 1 covariates, *g*_{ips}, and the level 3 shocks, *k*_{s}, while *γ*_{100} is the slope that measures the relationship between the outcome variable and the level 1 covariates. We assume that *υ*_{0s}, *u*_{0ps}, and *ε*_{ips} have zero mean and constant variance.
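To make the decomposition in (15) concrete, the sketch below evaluates its fixed-effect and random-effect parts for a single child. It is an illustration only: the coefficient values, the two-shock setup (the model above uses five), the treatment of *g* as a single indicator, and the function names are all hypothetical and are not estimates from the LSMS data.

```python
import random


def fixed_part(g, x, k, gamma_000, gamma_00k, gamma_100, gamma_10k, alpha_01):
    """Fixed-effect part of Eq. (15) for one child.

    g: level 1 covariate; x: level 2 covariate; k: list of level 3 shocks.
    gamma_00k and gamma_10k are coefficient lists aligned with k.
    """
    shock_main = sum(c * ks for c, ks in zip(gamma_00k, k))
    shock_interaction = sum(c * ks * g for c, ks in zip(gamma_10k, k))
    return gamma_000 + shock_main + gamma_100 * g + shock_interaction + alpha_01 * x


def random_part(rng, sd_shock, sd_household, sd_residual):
    """Draw v_0s + u_0ps + e_ips, each with zero mean and constant variance.

    In a full simulation v_0s would be drawn once per shock cluster and
    u_0ps once per household; here all three are drawn for a single child.
    """
    v_0s = rng.gauss(0.0, sd_shock)
    u_0ps = rng.gauss(0.0, sd_household)
    e_ips = rng.gauss(0.0, sd_residual)
    return v_0s + u_0ps + e_ips


# Hypothetical coefficients, chosen only for illustration.
params = dict(gamma_000=0.5, gamma_00k=[-0.3, -0.1], gamma_100=0.2,
              gamma_10k=[-0.4, 0.0], alpha_01=0.1)
rng = random.Random(0)  # seeded so the draw is reproducible

h_fixed = fixed_part(g=1, x=2.0, k=[1, 0], **params)
h_ips = h_fixed + random_part(rng, sd_shock=0.3, sd_household=0.5, sd_residual=1.0)
print(f"fixed part = {h_fixed:.3f}, simulated outcome = {h_ips:.3f}")
```

Separating the two functions mirrors the split in (15): the fixed part is fully determined by the covariates and coefficients, while the random part contributes only through the three variance components.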

Our hypothesis is that parents’ resource allocation and an aggregate shock to the household, observed at levels 2 and 3 respectively, have a biased effect on a child’s nutritional outcome. A mixed-effects model for panel data is appropriate for estimating these hierarchical effects on a child’s nutrition outcomes (Steenbergen and Jones 2002; Diez-Roux 2000). The multilevel analysis in (15) therefore estimates the fixed-effect part first and then the random-effect part. The fixed effects are estimated directly, and their coefficients are interpreted like standard regression coefficients, while the random effects are summarized by their variances and covariances (Raudenbush and Bryk, 2002; Castellano et al. 2014; Rabe-Hesketh et al. 2000).

The aim is to identify whether family resource allocations are biased by child gender and to estimate family allocation behavior in the presence of household level shocks.

Shock variables can be considered an intervention in the decision-making process and are treated at level 3, the highest level in the regression. All household level variables are included at level 2.

Gender is included in our regression as a group identifier to check whether nutrition varies between the clusters of boys and girls. The intra-class correlation coefficient indicates how strongly observations in the same cluster are correlated; it is calculated from the standard deviations of the constant and of the level 1 residual. If the coefficient approaches 0, grouping by gender at level 1 is not warranted and a simple regression is preferable. If it is close to 1, there is no variance between a boy and a girl to explain at level 1; they are the same. A value between 0 and 1 confirms that there is variance to explain due to individual heterogeneity.
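The correlation coefficient described above (commonly called the intraclass correlation, ICC) can be computed from the two estimated standard deviations. The sketch below assumes the usual variance-ratio definition, that is, the share of total variance attributable to the random intercept; the function name and the input values are hypothetical.

```python
def intraclass_correlation(sd_intercept, sd_residual):
    """Share of total variance due to the random intercept.

    sd_intercept: estimated standard deviation of the constant (random intercept).
    sd_residual: estimated standard deviation of the level 1 residual.
    """
    var_intercept = sd_intercept ** 2
    var_residual = sd_residual ** 2
    total = var_intercept + var_residual
    if total == 0:
        raise ValueError("both standard deviations are zero; the ICC is undefined")
    return var_intercept / total


# Hypothetical standard deviations, not estimates from the LSMS data.
icc = intraclass_correlation(sd_intercept=0.5, sd_residual=1.0)
print(f"ICC = {icc:.2f}")
```

With these inputs the coefficient is 0.20, meaning that one fifth of the total variance is attributed to the random intercept and the rest to the level 1 residual.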