Intelligent Fusion Technology, Inc., Germantown, MD 20876, USA

Department of Electrical Engineering, Columbia University, New York, NY 10027, USA

Abstract

Parameter estimation in dynamic systems finds applications in various disciplines, including systems biology. The well-known expectation-maximization (EM) algorithm is a popular method and has been widely used to solve system identification and parameter estimation problems. However, the conventional EM algorithm cannot exploit sparsity. On the other hand, in gene regulatory network inference problems, the parameters to be estimated often exhibit a sparse structure. In this paper, a regularized expectation-maximization (rEM) algorithm for sparse parameter estimation in nonlinear dynamic systems is proposed, based on the maximum a posteriori (MAP) estimation framework with a sparsity-enforcing Laplace prior. The expectation step involves Gaussian approximation filtering and smoothing, and the maximization step employs a re-weighted iterative thresholding method.

1 Introduction

The dynamic system is a widely used modeling tool that finds applications in many engineering disciplines. Techniques for state estimation in dynamic systems are well established. Recently, the problem of estimating unknown parameters in dynamic systems, in particular sparse parameters, has attracted increasing attention.

The expectation-maximization (EM) algorithm has also been applied to solve the sparse state estimation problem in dynamic systems.

In this paper, we focus on the $\ell_1$-regularized minimization problem arising in the maximization step, for which a re-weighted iterative thresholding algorithm is employed. To illustrate the proposed sparse parameter estimation method in dynamic systems, we consider gene regulatory network inference based on gene expression data.

The unscented Kalman filter has been used in the inference of gene regulatory networks. To the best of the authors' knowledge, the proposed rEM with the re-weighted iterative thresholding optimization algorithm is novel. Furthermore, we have systematically investigated the performance of the proposed algorithm and compared the results with those of conventional algorithms.

The remainder of this paper is organized as follows. In Section 2, the problem of sparse parameter estimation in dynamic systems is introduced and the regularized EM algorithm is formulated. In Section 3, the E-step of rEM, which involves forward-backward recursions and Gaussian approximations, is discussed. Section 4 discusses the $\ell_1$ optimization problem involved in the maximization step. Application of the proposed rEM algorithm to gene regulatory network inference is discussed in Section 5. Concluding remarks are given in Section 6.

2 Problem statement and the MAP-EM algorithm

We consider a general discrete-time nonlinear dynamic system with unknown parameters, given by the following state and measurement equations:

$$\mathbf{x}_k = \mathbf{f}(\mathbf{x}_{k-1}, \boldsymbol{\theta}) + \mathbf{v}_k,$$

$$\mathbf{y}_k = \mathbf{h}(\mathbf{x}_k, \boldsymbol{\theta}) + \mathbf{w}_k,$$

where $\mathbf{x}_k$ is the state vector at time $k$, $\mathbf{y}_k$ is the measurement vector, $\mathbf{f}(\cdot)$ and $\mathbf{h}(\cdot)$ are known nonlinear functions, $\boldsymbol{\theta}$ is the vector of unknown parameters, and $\mathbf{v}_k$ and $\mathbf{w}_k$ are the process noise and the measurement noise, respectively.

Denote by $\boldsymbol{\theta}$ the length-$m$ vector of unknown parameters, to be estimated from the measurement sequence $\mathbf{y}_{1:K}$.

In the EM algorithm and the MAP-EM algorithm, the estimate of $\boldsymbol{\theta}$ is updated iteratively by

$$\boldsymbol{\theta}^{(t+1)} = \arg\max_{\boldsymbol{\theta}} Q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{(t)})$$

and

$$\boldsymbol{\theta}^{(t+1)} = \arg\max_{\boldsymbol{\theta}} \left[ Q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{(t)}) + \log p(\boldsymbol{\theta}) \right],$$

respectively, where $Q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{(t)}) = \mathbb{E}\left[ \log p(\mathbf{x}_{0:K}, \mathbf{y}_{1:K} \mid \boldsymbol{\theta}) \mid \mathbf{y}_{1:K}, \boldsymbol{\theta}^{(t)} \right]$ and $p(\boldsymbol{\theta})$ is the prior density of the parameters.

Note that the regularized EM can be viewed as a special case of MAP-EM. To differentiate the sparsity-enforcing EM algorithm from the general MAP-EM algorithm, the term rEM is used. In this paper, the following assumption is made: the probability density function of the state is assumed to be Gaussian. The Bayesian filter is optimal; however, exact finite-dimensional solutions do not exist in general, so numerical approximations have to be made. The Gaussian approximation is frequently adopted due to its relatively low complexity and high accuracy.

We next consider the expression of the function $Q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{(t)})$. Due to the Markov structure of the state-space model, the complete-data log-likelihood decomposes over time. Therefore,

$$Q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{(t)}) + \log p(\boldsymbol{\theta}) = \sum_{k=1}^{K} \mathbb{E}\big[\log p(\mathbf{x}_k \mid \mathbf{x}_{k-1}, \boldsymbol{\theta})\big] + \sum_{k=1}^{K} \mathbb{E}\big[\log p(\mathbf{y}_k \mid \mathbf{x}_k, \boldsymbol{\theta})\big] - \sum_{i} \lambda_i |\theta_i| + c, \qquad (8)$$

where the expectations are taken with respect to the smoothed densities of the states $\mathbf{x}_{0:K}$ given $\mathbf{y}_{1:K}$ and $\boldsymbol{\theta}^{(t)}$, the Laplace prior $p(\boldsymbol{\theta}) \propto \exp(-\sum_i \lambda_i |\theta_i|)$ is used to enforce sparsity, $\boldsymbol{\lambda} = [\lambda_1, \ldots, \lambda_m]^T$ collects the regularization parameters, and $c$ is a constant independent of $\boldsymbol{\theta}$.

Note that in many applications, the unknown parameters $\boldsymbol{\theta}$ are only related to the state equation, but not to the measurement equation. In that case, the second term in (8) can be removed. In the next section, we discuss the procedures for computing the smoothed densities required to evaluate (8).

3 The E-step: computing the smoothed densities

We first discuss the calculation of the probability density functions of the states $\mathbf{x}_k$ given the observations, via a forward (filtering) recursion followed by a backward (smoothing) recursion.

3.1 Forward recursion

The forward recursion is composed of two steps: prediction and filtering. Specifically, given the filtered probability density function (PDF) at time $k-1$, assumed Gaussian, $p(\mathbf{x}_{k-1} \mid \mathbf{y}_{1:k-1}) = \mathcal{N}(\mathbf{x}_{k-1}; \hat{\mathbf{x}}_{k-1|k-1}, \mathbf{P}_{k-1|k-1})$, the predicted mean and covariance are

$$\hat{\mathbf{x}}_{k|k-1} = \int \mathbf{f}(\mathbf{x}_{k-1}, \boldsymbol{\theta})\, \mathcal{N}(\mathbf{x}_{k-1}; \hat{\mathbf{x}}_{k-1|k-1}, \mathbf{P}_{k-1|k-1})\, d\mathbf{x}_{k-1}, \qquad (9)$$

$$\mathbf{P}_{k|k-1} = \int \mathbf{f}(\mathbf{x}_{k-1}, \boldsymbol{\theta})\, \mathbf{f}^T(\mathbf{x}_{k-1}, \boldsymbol{\theta})\, \mathcal{N}(\mathbf{x}_{k-1}; \hat{\mathbf{x}}_{k-1|k-1}, \mathbf{P}_{k-1|k-1})\, d\mathbf{x}_{k-1} - \hat{\mathbf{x}}_{k|k-1} \hat{\mathbf{x}}_{k|k-1}^T + \mathbf{Q}_k, \qquad (10)$$

where $\mathbf{Q}_k$ is the process noise covariance.

It then follows that the predictive PDF is Gaussian, i.e., $p(\mathbf{x}_k \mid \mathbf{y}_{1:k-1}) = \mathcal{N}(\mathbf{x}_k; \hat{\mathbf{x}}_{k|k-1}, \mathbf{P}_{k|k-1})$.

Moreover, the filtered PDF is also Gaussian, i.e., $p(\mathbf{x}_k \mid \mathbf{y}_{1:k}) = \mathcal{N}(\mathbf{x}_k; \hat{\mathbf{x}}_{k|k}, \mathbf{P}_{k|k})$,

with the mean and covariance given by the standard Gaussian-filter measurement update

$$\hat{\mathbf{x}}_{k|k} = \hat{\mathbf{x}}_{k|k-1} + \mathbf{K}_k (\mathbf{y}_k - \hat{\mathbf{y}}_{k|k-1}), \qquad \mathbf{P}_{k|k} = \mathbf{P}_{k|k-1} - \mathbf{K}_k \mathbf{P}_{yy} \mathbf{K}_k^T,$$

where $\hat{\mathbf{y}}_{k|k-1}$ is the predicted measurement, $\mathbf{P}_{yy}$ is the innovation covariance, and $\mathbf{K}_k = \mathbf{P}_{xy} \mathbf{P}_{yy}^{-1}$ is the gain computed from the state-measurement cross-covariance $\mathbf{P}_{xy}$.
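As a minimal sketch, the prediction step (9)-(10) can be implemented with a sigma-point approximation of the Gaussian integrals. The function names and the default value of the unscented-transform parameter are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def predict(f, x_filt, P_filt, Q, kappa=1.0):
    """One Gaussian-filter prediction step: propagate N(x_filt, P_filt)
    through the state function f and add the process noise covariance Q."""
    n = x_filt.size
    # unscented-transform weights: center point plus 2n symmetric points
    w = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    L = np.linalg.cholesky(P_filt)       # P_filt = L L^T
    s = np.sqrt(n + kappa)
    pts = np.vstack([x_filt]
                    + [x_filt + s * L[:, i] for i in range(n)]
                    + [x_filt - s * L[:, i] for i in range(n)])
    fx = np.array([f(p) for p in pts])   # propagated sigma points
    x_pred = w @ fx                      # predicted mean
    d = fx - x_pred
    P_pred = (w[:, None] * d).T @ d + Q  # predicted covariance
    return x_pred, P_pred
```

For a linear state function the sigma-point prediction is exact, which gives a quick sanity check of the implementation.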

3.2 Backward recursion

In the backward recursion, we compute the smoothed PDFs $p(\mathbf{x}_k \mid \mathbf{y}_{1:K})$ for $k = K-1, \ldots, 0$, starting from the filtered PDF at the final time $K$.

Due to the Markov property of the state-space model, we have $p(\mathbf{x}_k \mid \mathbf{x}_{k+1}, \mathbf{y}_{1:K}) = p(\mathbf{x}_k \mid \mathbf{x}_{k+1}, \mathbf{y}_{1:k})$.

Now, assume that the smoothed PDF at time $k+1$ is Gaussian, i.e., $p(\mathbf{x}_{k+1} \mid \mathbf{y}_{1:K}) = \mathcal{N}(\mathbf{x}_{k+1}; \hat{\mathbf{x}}_{k+1|K}, \mathbf{P}_{k+1|K})$.

It then follows from (17) and (19) that the smoothed PDF at time $k$ is also Gaussian, i.e., $p(\mathbf{x}_k \mid \mathbf{y}_{1:K}) = \mathcal{N}(\mathbf{x}_k; \hat{\mathbf{x}}_{k|K}, \mathbf{P}_{k|K})$,

where the smoothed mean and covariance follow the Rauch-Tung-Striebel-type recursion

$$\hat{\mathbf{x}}_{k|K} = \hat{\mathbf{x}}_{k|k} + \mathbf{G}_k (\hat{\mathbf{x}}_{k+1|K} - \hat{\mathbf{x}}_{k+1|k}), \qquad \mathbf{P}_{k|K} = \mathbf{P}_{k|k} + \mathbf{G}_k (\mathbf{P}_{k+1|K} - \mathbf{P}_{k+1|k}) \mathbf{G}_k^T,$$

with $\mathbf{G}_k$ the smoother gain.

3.3 Approximating the integrals

The integrals associated with the expectations in the forward-backward recursions for computing the approximate state PDFs, i.e., (9), (10), (13), (15), (16), and (18), as well as the integrals involved in computing the function $Q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{(t)})$, are all Gaussian-type integrals of the form $\int \mathbf{g}(\mathbf{x})\, \mathcal{N}(\mathbf{x}; \hat{\mathbf{x}}, \mathbf{P})\, d\mathbf{x}$. If a set of weighted points $\{\boldsymbol{\gamma}_i, w_i\}_{i=1}^{N_p}$ is chosen for the standard Gaussian density,

then the general Gaussian-type integral can be approximated by

$$\int \mathbf{g}(\mathbf{x})\, \mathcal{N}(\mathbf{x}; \hat{\mathbf{x}}, \mathbf{P})\, d\mathbf{x} \approx \sum_{i=1}^{N_p} w_i\, \mathbf{g}\big(\hat{\mathbf{x}} + \sqrt{\mathbf{P}}\, \boldsymbol{\gamma}_i\big),$$

where $\mathbf{P} = \sqrt{\mathbf{P}} \sqrt{\mathbf{P}}^T$ is a square-root factorization of the covariance (e.g., the Cholesky decomposition).

By using different point sets, different Gaussian approximation filters and smoothers can be obtained, such as the Gauss-Hermite quadrature (GHQ)-based filter and the unscented transform (UT)-based filter.

For the numerical results in this paper, the UT is used in the Gaussian approximation filter and smoother, where for an $n$-dimensional state we have the $2n+1$ sigma points

$$\boldsymbol{\gamma}_0 = \mathbf{0}, \qquad w_0 = \frac{\kappa}{n+\kappa},$$

and

$$\boldsymbol{\gamma}_i = \sqrt{n+\kappa}\,\mathbf{e}_i, \qquad \boldsymbol{\gamma}_{n+i} = -\sqrt{n+\kappa}\,\mathbf{e}_i, \qquad w_i = w_{n+i} = \frac{1}{2(n+\kappa)}, \qquad i = 1, \ldots, n,$$

where $\mathbf{e}_i$ denotes the $i$-th $n$-dimensional unit vector and $\kappa$ is a tuning parameter.
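The UT point set and the resulting Gaussian-integral approximation can be sketched as follows. This is a minimal implementation under the conventions above; the function names and the default $\kappa = 1$ are assumptions.

```python
import numpy as np

def ut_points(n, kappa=1.0):
    """2n+1 unscented-transform sigma points and weights for N(0, I)."""
    gamma = np.zeros((2 * n + 1, n))
    w = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    s = np.sqrt(n + kappa)
    for i in range(n):
        gamma[1 + i, i] = s        # +sqrt(n+kappa) e_i
        gamma[1 + n + i, i] = -s   # -sqrt(n+kappa) e_i
    return gamma, w

def gauss_integral(g, mean, cov, kappa=1.0):
    """Approximate E[g(x)] for x ~ N(mean, cov) with the UT point set."""
    n = mean.size
    gamma, w = ut_points(n, kappa)
    L = np.linalg.cholesky(cov)    # cov = L L^T
    pts = mean + gamma @ L.T       # x_i = mean + L gamma_i
    vals = np.array([g(x) for x in pts])
    return np.tensordot(w, vals, axes=1)
```

Because the UT matches the first two moments of the Gaussian exactly, the approximation is exact for polynomials up to second order.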

4 The M-step: solving the $\ell_1$ optimization problem

Solving the $\ell_1$ optimization problem in (8) is not trivial, since $|\theta_i|$ is nondifferentiable at $\theta_i = 0$. Nevertheless, $\ell_1$ optimization is a useful tool for obtaining sparse solutions, and methods for solving linear inverse problems with sparsity constraints have been reviewed in the literature.

4.1 Iterative thresholding algorithm

Denote by $f(\boldsymbol{\theta}) = -Q(\boldsymbol{\theta} \mid \boldsymbol{\theta}^{(t)})$ the smooth part of the negated objective in (8), so that the M-step amounts to

$$\min_{\boldsymbol{\theta}} \; f(\boldsymbol{\theta}) + \sum_i \lambda_i |\theta_i|. \qquad (29)$$

The solution to (29) can be iteratively obtained by solving a sequence of surrogate optimization problems

$$\boldsymbol{\theta}^{(j+1)} = \arg\min_{\mathbf{z}} \left\{ f(\boldsymbol{\theta}^{(j)}) + \nabla f(\boldsymbol{\theta}^{(j)})^T (\mathbf{z} - \boldsymbol{\theta}^{(j)}) + \frac{\alpha_t}{2} \|\mathbf{z} - \boldsymbol{\theta}^{(j)}\|_2^2 + \sum_i \lambda_i |z_i| \right\}, \qquad (31)$$

where $\alpha_t$ is such that $\alpha_t \mathbf{I}$ mimics the Hessian $\nabla^2 f$, and $\mathbf{z}$ denotes the variable to be optimized in the objective function.

The equivalent form of (31) is given by

$$\boldsymbol{\theta}^{(j+1)} = \arg\min_{\mathbf{z}} \left\{ \frac{1}{2} \left\| \mathbf{z} - \left( \boldsymbol{\theta}^{(j)} - \frac{1}{\alpha_t} \nabla f(\boldsymbol{\theta}^{(j)}) \right) \right\|_2^2 + \frac{1}{\alpha_t} \sum_i \lambda_i |z_i| \right\}. \qquad (32)$$

Note that Equation 34 is derived as follows. Because we require that $\alpha_t \mathbf{I}$ mimics the Hessian $\nabla^2 f$ over the most recent step, in the least-squares sense $\alpha_t = \arg\min_\alpha \|\alpha\, \Delta\boldsymbol{\theta} - \Delta\mathbf{g}\|_2^2$, we obtain

$$\alpha_t = \frac{\Delta\boldsymbol{\theta}^T \Delta\mathbf{g}}{\Delta\boldsymbol{\theta}^T \Delta\boldsymbol{\theta}}, \qquad (34)$$

where $\Delta\boldsymbol{\theta} = \boldsymbol{\theta}^{(j)} - \boldsymbol{\theta}^{(j-1)}$ and $\Delta\mathbf{g} = \nabla f(\boldsymbol{\theta}^{(j)}) - \nabla f(\boldsymbol{\theta}^{(j-1)})$.

The solution to (32) is given by

$$\boldsymbol{\theta}^{(j+1)} = \operatorname{soft}\!\left( \boldsymbol{\theta}^{(j)} - \frac{1}{\alpha_t} \nabla f(\boldsymbol{\theta}^{(j)}),\; \frac{\boldsymbol{\lambda}}{\alpha_t} \right),$$

where $\operatorname{soft}(\mathbf{u}, \mathbf{a})_i = \operatorname{sign}(u_i) \max(|u_i| - a_i, 0)$ is the soft thresholding function, with $\operatorname{sign}(u_i)$ denoting the sign of $u_i$.

Finally, the iterative procedure for solving (29) is given by repeating the above thresholded gradient step, and the iteration stops when the following condition is met:

$$\frac{\|\boldsymbol{\theta}^{(j+1)} - \boldsymbol{\theta}^{(j)}\|_2}{\|\boldsymbol{\theta}^{(j)}\|_2} \le \epsilon,$$

where $\epsilon$ is a prescribed tolerance.
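The iterative thresholding procedure can be sketched as follows. This is a minimal implementation; the step-size safeguard, the default tolerance, and the iteration cap are assumptions, not values from the paper.

```python
import numpy as np

def soft(u, a):
    """Elementwise soft thresholding: sign(u) * max(|u| - a, 0)."""
    return np.sign(u) * np.maximum(np.abs(u) - a, 0.0)

def ista(grad_f, theta0, lam, alpha0=1.0, tol=1e-6, max_iter=500):
    """Iterative thresholding for min_theta f(theta) + sum_i lam_i |theta_i|,
    with a Barzilai-Borwein-type alpha_t so that alpha_t * I mimics the Hessian."""
    theta = theta0.copy()
    g = grad_f(theta)
    alpha = alpha0
    for _ in range(max_iter):
        theta_new = soft(theta - g / alpha, lam / alpha)
        if np.linalg.norm(theta_new - theta) <= tol * max(np.linalg.norm(theta), 1e-12):
            theta = theta_new
            break
        g_new = grad_f(theta_new)
        d_theta, d_g = theta_new - theta, g_new - g
        denom = d_theta @ d_theta
        if denom > 0 and d_theta @ d_g > 0:
            alpha = (d_theta @ d_g) / denom   # least-squares Hessian surrogate
        theta, g = theta_new, g_new
    return theta
```

For the separable quadratic $f(\boldsymbol{\theta}) = \tfrac{1}{2}\|\boldsymbol{\theta} - \mathbf{b}\|_2^2$ the minimizer is exactly $\operatorname{soft}(\mathbf{b}, \boldsymbol{\lambda})$, which is a convenient correctness check.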

4.2 Adaptive selection of $\boldsymbol{\lambda}$

So far, the parameters $\lambda_i$ in the Laplace prior have been fixed. Here, we propose to adaptively tune them based on the output of the iterative thresholding algorithm. The algorithm consists of solving a sequence of weighted $\ell_1$-minimization problems, where the weights $\lambda_i$ used for the next iteration are computed from the value of the current solution. A good choice of $\lambda_i$ is to make them counteract the influence of the signal magnitude on the $\ell_1$ penalty function. Initially, we set $\lambda_i = 1, \forall i$. Next, we update the weights as

$$\lambda_i^{(j+1)} = \frac{1}{|\theta_i^{(j)}| + \delta},$$

where $\delta > 0$ is a small constant that provides numerical stability and prevents a component estimated as zero from being forbidden to become nonzero at the next iteration.
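The re-weighting rule can be sketched as below; the value of $\delta$ is an assumption for illustration.

```python
import numpy as np

def update_weights(theta, delta=1e-2):
    """Re-weighting rule for the Laplace-prior parameters:
    small-magnitude components get large weights (pushed toward zero),
    large-magnitude components get small weights (lightly penalized)."""
    return 1.0 / (np.abs(theta) + delta)
```

In the outer loop, one alternates between solving the weighted $\ell_1$ problem and refreshing the weights with this rule until the support stabilizes.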

5 Application to gene regulatory network inference

The gene regulatory network can be described by a graph in which genes are viewed as nodes and edges depict causal relations between genes. By analyzing gene expression levels collected over a period of time, one can find regulatory relations between different genes. Under the discrete-time state-space modeling, for a gene regulatory network with $n$ genes, the state $\mathbf{x}_k$ collects the expression levels of the $n$ genes at time $k$.

In this case, the nonlinear function $\mathbf{f}(\cdot)$ in the state equation takes the form

$$\mathbf{x}_k = \mathbf{A}\, \mathbf{g}(\mathbf{x}_{k-1}) + \mathbf{v}_k, \qquad (41)$$

with

$$\mathbf{g}(\mathbf{x}) = [g(x_1), \ldots, g(x_n)]^T,$$

and the sigmoidal regulation function

$$g(x) = \frac{1}{1 + e^{-x}}.$$

In (41), $\mathbf{A}$ is an $n \times n$ matrix of regulatory coefficients: a nonzero $a_{ij}$ indicates that gene $j$ regulates gene $i$, while $a_{ij} = 0$ indicates no regulation, and the unknown parameter vector is $\boldsymbol{\theta} = \mathrm{vec}(\mathbf{A})$.

For the measurement model, we have

$$\mathbf{y}_k = \mathbf{x}_k + \mathbf{w}_k, \qquad (44)$$

where $\mathbf{w}_k$ is the measurement noise.
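A simulator for this sigmoidal GRN state-space model can be sketched as follows; the noise variances and the seed are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    """Sigmoidal regulation function g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def simulate_grn(A, x0, K, q_var=0.01, r_var=0.01, seed=0):
    """Simulate x_k = A g(x_{k-1}) + v_k and y_k = x_k + w_k for K steps,
    with i.i.d. Gaussian process and measurement noise."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = np.asarray(x0, dtype=float)
    X, Y = [], []
    for _ in range(K):
        x = A @ sigmoid(x) + rng.normal(0.0, np.sqrt(q_var), n)
        y = x + rng.normal(0.0, np.sqrt(r_var), n)
        X.append(x.copy())
        Y.append(y)
    return np.array(X), np.array(Y)
```

Setting the noise variances to zero reduces the model to the deterministic recursion, which makes the dynamics easy to verify.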

5.1 Inference of gene regulatory network with four genes

In the simulations, we consider a network with four genes. The true gene regulatory coefficient matrix $\mathbf{A}$ is a sparse $4 \times 4$ matrix. To compare the EM algorithm with the proposed rEM algorithm, the simulation was conducted ten times. In each run, the initial value of $\mathbf{A}$ is randomly generated.

As a performance metric, the receiver operating characteristic (ROC) curve is frequently used. However, for this specific example, as the false-positive rate increases, the true-positive rate given by rEM and EM is always high (close to 1), which makes distinguishing the performance of the rEM and EM algorithms difficult. Hence, the root mean-squared error (RMSE) and the sparsity factor (SF) are used in this section. The RMSE is defined by

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{r=1}^{N} \big\| \hat{\mathbf{A}}^{(r)} - \mathbf{A} \big\|_F^2},$$

where $\hat{\mathbf{A}}^{(r)}$ is the estimate of the true coefficient matrix $\mathbf{A}$ in the $r$-th of the $N$ simulation runs and $\|\cdot\|_F$ denotes the Frobenius norm. The SF is given by

$$\mathrm{SF} = \frac{\|\hat{\mathbf{A}}\|_0}{n^2},$$

where $\|\cdot\|_0$ denotes the $\ell_0$ pseudo-norm (the number of nonzero entries) and $n^2$ is the total number of entries of $\mathbf{A}$.

In addition, to test the effectiveness of the proposed method at finding the support of the unknown parameters, the number of matched elements is used, obtained by the following procedure: (1) compute the support of $\mathbf{A}$ using the true parameters; (2) compute the support of the estimated parameters; and (3) count the number of positions at which the two supports agree.
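The three metrics can be sketched as follows; the numerical tolerance used to decide whether an entry counts as nonzero is an assumption.

```python
import numpy as np

def rmse(A_hats, A_true):
    """Root mean-squared error over the simulation runs (Frobenius norm)."""
    errs = [np.linalg.norm(A_hat - A_true, 'fro') ** 2 for A_hat in A_hats]
    return np.sqrt(np.mean(errs))

def sparsity_factor(A_hat, tol=1e-6):
    """Fraction of (numerically) nonzero entries in the estimate."""
    return np.count_nonzero(np.abs(A_hat) > tol) / A_hat.size

def matched_elements(A_hat, A_true, tol=1e-6):
    """Number of positions where the estimated and true supports agree."""
    return int(np.sum((np.abs(A_hat) > tol) == (np.abs(A_true) > tol)))
```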

5.1.1 The effect of different $\boldsymbol{\lambda}$

The performance of rEM using different values of $\boldsymbol{\lambda}$ is compared with that of rEM$_w$ (the rEM with adaptively re-weighted $\boldsymbol{\lambda}$). The RMSE and SF are shown in the figures below; rEM$_w$ provides the smallest RMSE and the sparsest parameter estimates. The number of matched elements of the tested algorithms with different $\boldsymbol{\lambda}$ is also shown; rEM$_w$ provides more matched elements than the EM algorithm.

**RMSE of rEM with different $\boldsymbol{\lambda}$.**

**SF of rEM with different $\boldsymbol{\lambda}$.**

**The number of matched elements of rEM with different $\boldsymbol{\lambda}$.**

5.1.2 The effect of noise

Two different cases are tested. In the first case, the covariances of the process noise and the measurement noise are chosen to be 0.01. In the second case, they are chosen to be 0.1. The performance in the two test cases is shown in the figures below; as expected, rEM$_w$ achieves better performance under the lower noise level.

**RMSE of rEM$_w$ with different noise levels.**

**SF of rEM$_w$ with different noise levels.**

**The number of matched elements of rEM$_w$ with different noise levels.**

5.1.3 The effect of the number of observations

In order to test the effect of the number of observations, the rEM$_w$ algorithm is tested with 10 and 20 observations. The simulation results are shown in the figures below; rEM$_w$ with more observations gives lower RMSE. In addition, rEM$_w$ with more observations gives a slightly better result in finding the support of the unknown parameters.

**RMSE of rEM$_w$ with different lengths of observations.**

**SF of rEM$_w$ with different lengths of observations.**

**The number of matched elements of rEM$_w$ with different lengths of observations.**

5.1.4 The effect of $\kappa$

In order to test the sensitivity to the unscented transform parameter $\kappa$, the rEM$_w$ algorithm is tested with different values of $\kappa$; the resulting RMSE, SF, and number of matched elements are shown in the figures below.

**RMSE of rEM$_w$ with different $\kappa$.**

**SF of rEM$_w$ with different $\kappa$.**

**The number of matched elements of rEM$_w$ with different $\kappa$.**

5.1.5 Effect of sparsity level

The performance comparison of the rEM$_w$ and the conventional EM with different sparsity levels of $\mathbf{A}$ is shown in the figures below.

**RMSE of the EM and rEM$_w$ for the normal and the denser $\mathbf{A}$.**

**SF of the EM and rEM$_w$ for the normal and the denser $\mathbf{A}$.**

**The number of matched elements of the EM and rEM$_w$ for the normal and the denser $\mathbf{A}$.**

Note that ‘(Denser)’ is used to denote the result using the $\mathbf{A}$ shown in Equation 48. It can be seen that the RMSE of rEM$_w$ remains lower than that of the conventional EM in both cases.

5.1.6 Comparison with $\ell_1$ optimization

We compare the proposed rEM algorithm with an $\ell_1$ optimization-based method, as well as with the conventional EM algorithm. $\ell_1$ optimization is a popular approach for obtaining sparse solutions. For the problem under consideration, it obtains an estimate of $\boldsymbol{\theta}$ by minimizing a least-squares data-fitting term penalized by $\lambda \|\boldsymbol{\theta}\|_1$, where $\lambda$ is a regularization parameter.

We also compare the $\ell_1$ optimization method with the proposed rEM$_w$ algorithm; the results are shown in the figures below for several values of the regularization parameter $\lambda$. The RMSE does not decrease monotonically with decreasing $\lambda$. A re-weighted $\ell_1$ optimization algorithm is also used in the simulation. However, none of the $\ell_1$ optimization-based methods achieves better performance than the rEM$_w$.

**RMSE of the $\ell_1$ optimization with different $\lambda$, the re-weighted $\ell_1$ optimization, and the rEM$_w$.**

**SF of the $\ell_1$ optimization with different $\lambda$, the re-weighted $\ell_1$ optimization, and the rEM$_w$.**

**The number of matched elements of the $\ell_1$ optimization with different $\lambda$, the re-weighted $\ell_1$ optimization, and the rEM$_w$.**

5.1.7 Comparison with BPDN-DF

To solve the problem using BPDN-DF (basis pursuit denoising with dynamic filtering), the models in (41) and (44) are reformulated into the form required by the BPDN-DF algorithm, which solves a weighted $\ell_1$-regularized least-squares problem at each time step with regularization weights $\boldsymbol{\lambda} = [\lambda_1, \ldots, \lambda_m]^T$.

**RMSE of BPDN-DF and rEM$_w$.**

**SF of BPDN-DF and rEM$_w$.**

**The number of matched elements of BPDN-DF and rEM$_w$.**

5.2 Inference of gene regulatory network with eight genes

In this section, we test the proposed algorithm using a larger gene regulatory network with eight genes; the performances of the EM, the rEM, the rEM$_w$, the $\ell_1$ optimization method, and BPDN-DF are compared. Forty data points are collected to infer the structure of the network. The system noise and the measurement noise are assumed to be Gaussian-distributed with zero means and known covariances.

For testing, each coefficient in the estimated $\mathbf{A}$ is compared against a varying threshold to decide whether the corresponding regulatory link is present.

The metric used to evaluate the inferred GRN is the ROC curve, which involves the true-positive rate (TPR) and the false-positive rate (FPR). They are given by

$$\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad \mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}},$$

where the numbers of true positives (TP), false negatives (FN), false positives (FP), and true negatives (TN) are obtained by comparing the support of the estimated $\mathbf{A}$ with that of the ground truth.
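The TPR/FPR computation at a given magnitude threshold can be sketched as follows; the thresholding convention for declaring an edge is an assumption.

```python
import numpy as np

def roc_point(A_hat, A_true, thresh):
    """TPR and FPR of edge detection at a given magnitude threshold."""
    pred = np.abs(A_hat) > thresh    # predicted edges
    truth = A_true != 0              # true edges
    tp = np.sum(pred & truth)
    fn = np.sum(~pred & truth)
    fp = np.sum(pred & ~truth)
    tn = np.sum(~pred & ~truth)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr
```

Sweeping `thresh` over a grid of values traces out the ROC curve for a given estimate.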

The ROC curves of the EM, the rEM, and the rEM$_w$ are compared in the figure below; it can be seen that the rEM$_w$ performs better than the rEM and the conventional EM algorithms.

**ROCs of the EM, the rEM, and the rEM$_w$.**

In addition, a sparse solution is obtained by using rEM and rEM$_w$, while it cannot be obtained by the EM algorithm. The sparsity factors of rEM and rEM$_w$ are shown in the figure below; the sparsity factor given by rEM$_w$ is closer to the ground truth than that given by the EM algorithm.

**Sparsity factor of the rEM and the rEM$_w$.**

In the figure below, the ROC curves of the rEM$_w$, the $\ell_1$ optimization method, and BPDN-DF are compared. Similarly, the $\ell_1$ optimization method is tested with different values of $\lambda$. The rEM$_w$ performs much better than the $\ell_1$ optimization method and the BPDN-DF algorithm. Hence, the sparsity factors of the $\ell_1$ optimization method and BPDN-DF are not shown.

**ROCs of the rEM$_w$, the $\ell_1$ optimization method, and BPDN-DF.**

5.3 Inference of gene regulatory network from malaria expression data

The dataset containing the expression data of the first six malaria genes is taken from the literature. The matrices $\mathbf{A}$ inferred by the EM algorithm, by the rEM, and by the rEM$_w$ were examined; the rEM-based estimates are markedly sparser than the EM estimate.

The state estimates provided by the UKF based on the models using the parameters inferred by the EM, the rEM, and the rEM$_w$, together with the true gene expression, are shown in the figure below. It can be seen that the estimated gene expression given by the rEM$_w$-based model is close to the true gene expression data. In addition, the rEM$_w$ algorithm provides a sparser solution than the rEM algorithm, and both the rEM and the rEM$_w$ algorithms give sparser solutions than the EM algorithm, which validates the effectiveness of the proposed method.

**True malaria gene expression and estimated gene expression by different algorithms.**

6 Conclusions

In this paper, we have considered the problem of sparse parameter estimation in a general nonlinear dynamic system, and we have proposed an approximate MAP-EM solution, called the rEM algorithm. The expectation step involves forward Gaussian approximation filtering and backward Gaussian approximation smoothing. The maximization step employs a re-weighted iterative thresholding method. We have provided examples of gene regulatory network inference based on expression data. Comparisons with the traditional EM algorithm, as well as with existing approaches to sparse problems such as $\ell_1$ optimization and BPDN-DF, show that the proposed rEM algorithm provides both more accurate estimates and sparser solutions.

Competing interests

The authors declare that they have no competing interests.