
We consider the estimation of nonparametric regression models with predictors measured with a mixture of Berkson and classical errors. In practice, a Berkson error arises when the variable X of interest is unobservable and only a proxy of X can be measured, while the inaccuracy in observing the proxy causes an error of classical type. In this paper, we propose two nonparametric estimators of the regression function in the presence of either or both types of error. We prove the asymptotic normality of our estimators and derive their rates of convergence. The finite-sample properties of the estimators are investigated through simulation studies.

Let

where

where

In many studies, however, it is too costly or impossible to measure the predictor X exactly or directly. Instead, a proxy W of X is measured. For settings where the direct observation assumption is violated, [

where

The stochastic structure of Model (3) is fundamentally different from that of Model (2). In Model (2), the measurement error is independent of X but dependent on W; in Model (3), the dependence is reversed. This distinction leads to completely different estimation and inference procedures for the two models. In particular, nonparametric estimators that are consistent in Model (2) are no longer valid in Model (3), and vice versa. Most of the existing literature assumes the measurement error is of only one of the two types. In the Berkson model (3), the observable variable W is usually assumed to be measured with perfect accuracy. However, this may not hold in practice. In such cases, W is observed through
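The difference between the two error structures is easy to see in simulation. The sketch below is illustrative only: all distributions, variances, and variable names are assumptions, not the paper's specification. It draws data from a classical-error model, a Berkson model, and a mixed structure in which the Berkson proxy is itself recorded with classical error, then checks the defining independences.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Classical error: W = X + U, with U independent of X.
x_cls = rng.normal(0.0, 1.0, n)
u = rng.normal(0.0, 0.5, n)
w_cls = x_cls + u

# Berkson error: X = W + delta, with delta independent of W.
w_brk = rng.normal(0.0, 1.0, n)
delta = rng.normal(0.0, 0.5, n)
x_brk = w_brk + delta

# Mixed structure: the Berkson proxy W is itself recorded with a
# classical error eta, so only w_obs is available to the statistician.
eta = rng.normal(0.0, 0.3, n)
w_obs = w_brk + eta

# The defining independences differ between the two models:
# classical: U independent of X;  Berkson: delta independent of W.
print(np.corrcoef(u, x_cls)[0, 1], np.corrcoef(delta, w_brk)[0, 1])
```

Both sample correlations are near zero, reflecting the two different independence assumptions; a consistent estimator must exploit the right one.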

where

This paper is organised as follows. In Section 2, we propose estimators for the regression function curve

Let

Hence, if

Since

Noticing that, if

under the condition that

where

As a result, we propose the following estimator for

Example 1 Let the error densities

ratio

where
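Example 1's densities are elided above. As an illustrative stand-in, suppose the Berkson error is normal and the classical error is Laplace (the (N, L) pairing also used in the simulations later on); the parameter values below are arbitrary. The ratio of their characteristic functions is then bounded and integrable, which is the kind of condition an estimator such as (6) needs.

```python
import numpy as np

def phi_normal(t, sigma):
    # Characteristic function of N(0, sigma^2).
    return np.exp(-0.5 * (sigma * t) ** 2)

def phi_laplace(t, b):
    # Characteristic function of Laplace(0, b).
    return 1.0 / (1.0 + (b * t) ** 2)

def cf_ratio(t, sigma_delta=0.2, b_u=0.1):
    # Ratio of Berkson-error CF to classical-error CF; integrability of
    # this ratio is what an estimator of type (6) requires.
    return phi_normal(t, sigma_delta) / phi_laplace(t, b_u)

t = np.linspace(-50.0, 50.0, 2001)
r = cf_ratio(t)
# The Gaussian numerator decays faster than the polynomial growth of
# 1/phi_laplace, so the ratio is bounded (by its value 1 at t = 0)
# and integrable.
print(r.max())
```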

Using a kernel function

and an estimator for

where

Proceeding as above, we get an alternative estimator of

where

Therefore, when (6) is no longer valid, we propose the following estimator for
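The construction above inverts a ratio of characteristic functions against the Fourier transform of a kernel. The sketch below is a generic deconvolution-kernel weight, not the paper's estimator: it assumes a kernel whose characteristic function is (1 - t^2)^3 on [-1, 1] and a Laplace classical-error law, both purely for illustration, and plugs the weight into a Nadaraya-Watson-type ratio.

```python
import numpy as np

def phi_k(t):
    # CF of a common deconvolution kernel: (1 - t^2)^3 on [-1, 1].
    return np.where(np.abs(t) <= 1.0, (1.0 - np.minimum(t * t, 1.0)) ** 3, 0.0)

def phi_laplace(t, b):
    return 1.0 / (1.0 + (b * t) ** 2)

def deconv_weight(x, h, b=0.1, m=2000):
    # L(x) = (1/pi) * int_0^1 cos(t x) * phi_K(t) / phi_U(t / h) dt,
    # computed by the trapezoidal rule (phi_K is even, so the cosine
    # form suffices).
    x = np.atleast_1d(np.asarray(x, dtype=float))
    t = np.linspace(0.0, 1.0, m + 1)
    g = phi_k(t) / phi_laplace(t / h, b)
    integrand = np.cos(np.outer(x, t)) * g
    dt = t[1] - t[0]
    return ((integrand[:, :-1] + integrand[:, 1:]).sum(axis=1) * dt / 2.0) / np.pi

def nw_deconv(x0, w, y, h, b=0.1):
    # Nadaraya-Watson-type ratio with deconvolution weights (sketch).
    lw = deconv_weight((x0 - w) / h, h, b)
    return float((lw * y).sum() / lw.sum())
```

Because the weights are normalized by their sum, the sketch reproduces a constant regression function exactly, which is a cheap sanity check on any such ratio estimator.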

Remark 1 To ensure that the proposed estimator (9) is well-behaved, we need to make the following assumption.

Condition A:

1.

2.

Example 2 We use the same model as in Example 1 with

Remark 2

1. The above two nonparametric estimators of

2. When the variance of

where

3. When the variance of

In this section, we study asymptotic properties of the estimators proposed in Section 2. In particular, the properties of the estimator

In this section, we investigate the large-sample properties of the estimator

Condition B:

1.

2.

3.

4. The conditional moment

Let

Theorem 1 (MSCE) Suppose that Conditions A and B hold. Then, for each x such that

where

Explicit rates of convergence of the estimator

where

The second term on the right-hand side of Equation (11) describes the variance of

1. An exponential ratio of order

with

2. A polynomial ratio of order

with
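The two regimes can be illustrated by comparing characteristic-function tails: a normal CF decays exponentially (the supersmooth case behind the exponential ratio), while a Laplace CF decays only polynomially (the ordinary smooth case behind the polynomial ratio). The parameter values below are arbitrary.

```python
import numpy as np

t = np.linspace(1.0, 40.0, 400)
sigma, b = 0.2, 0.2

phi_norm = np.exp(-0.5 * (sigma * t) ** 2)   # exponential (supersmooth) tail
phi_lap = 1.0 / (1.0 + (b * t) ** 2)         # polynomial (ordinary smooth) tail

# At t = 40 the normal CF is already ~1e-14 while the Laplace CF is ~1e-2.
# An exponentially small CF in the denominator of the ratio forces a
# logarithmic rate of convergence; a polynomial tail allows an algebraic rate.
print(phi_norm[-1], phi_lap[-1])
```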

In this section, we study the asymptotic behaviour of the MSE where

Theorem 2 Suppose that Conditions A and B hold and that the first half of inequality (12) is satisfied. Assume that

with

When

Theorem 3 Suppose Conditions A and B hold, and that

under the polynomial ratio (13), for each x such that

with

We obtain that, when

The theorem below establishes asymptotic normality in the exponential ratio case.

Theorem 4 Under the conditions of Theorem 2, and for bandwidth

where

The next theorem establishes asymptotic normality in the polynomial ratio case.

Theorem 5 Suppose that Conditions A and B hold and that the inequality of (13) is satisfied. Assume that

where

The proofs of all theorems are postponed to the Appendix.

When the error densities are unknown, they can be readily estimated from additional observations (e.g., a sample from the error densities, replicated data or external data) and these estimates can be substituted into (6) and (9) to produce the estimate of
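One standard route, sketched below as an illustration rather than the paper's procedure, assumes a symmetric classical error and two replicated proxies per subject: the latent X cancels in the replicate differences, so the modulus of the error characteristic function can be estimated directly from them.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(0.0, 1.0, n)
u1 = rng.laplace(0.0, 0.2, n)   # two independent classical-error draws
u2 = rng.laplace(0.0, 0.2, n)
w1, w2 = x + u1, x + u2         # replicated proxies of the same X

def abs_phi_u(t, d):
    # For a symmetric error U, E cos(t (U - U')) = |phi_U(t)|^2, so the
    # modulus of the error CF is estimable from replicate differences d.
    t = np.atleast_1d(np.asarray(t, dtype=float))
    sq = np.mean(np.cos(np.outer(t, d)), axis=1)
    return np.sqrt(np.maximum(sq, 0.0))

d = w1 - w2                      # = u1 - u2: the unobserved X cancels
t = np.array([0.5, 1.0, 2.0])
est = abs_phi_u(t, d)
true_cf = 1.0 / (1.0 + (0.2 * t) ** 2)   # Laplace(0, 0.2) CF, for comparison
print(est, true_cf)
```

The estimated moduli track the true Laplace characteristic function closely at this sample size; such an estimate can then replace the unknown error CF in the estimator.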

We study the numerical properties of the estimators proposed in Section 2. Recall that we defined two estimators, (6) and (9). The first exists when

We apply the various estimators introduced above to simulated examples (see [

1.

2.

3.

where

In our simulations we consider sample sizes

For any nonparametric regression method, the quality of the estimator also depends on the discrepancy of the observed sample. That is, for any given family of densities

| n | (N, L) MISE | (N, N) MISE | (L, L) MISE | (L, N) MISE |
| --- | --- | --- | --- | --- |
| 50 | 5.3524 | 8.3704 | 21.7584 | 9.9570 |
| 100 | 3.2803 | 6.8685 | 11.1636 | 6.7162 |
| 250 | 2.7013 | 5.4176 | 6.8579 | 4.9409 |
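The MISE figures reported in these tables can be approximated by Monte Carlo. A minimal sketch follows, with a synthetic truth and synthetic "estimates" standing in for the paper's simulation design.

```python
import numpy as np

def mise(estimates, truth, grid):
    # Monte Carlo MISE: mean over replications of the integrated squared
    # error, with the integral approximated by a Riemann sum on the grid.
    dg = grid[1] - grid[0]
    return float((((estimates - truth) ** 2).sum(axis=1) * dg).mean())

# Hypothetical check: 200 'replications' equal to the truth plus noise of
# standard deviation 0.1 should give MISE close to 0.1^2 * |interval|.
rng = np.random.default_rng(2)
grid = np.linspace(0.0, 1.0, 101)
truth = np.sin(2.0 * np.pi * grid)
estimates = truth + rng.normal(0.0, 0.1, (200, grid.size))
print(mise(estimates, truth, grid))
```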

Finally, we compare

In this paper, we propose a new method for estimating nonparametric regression models with predictors measured with a mixture of Berkson and classical errors. The method is based on the relative smoothness of

| Method | (0.1,0.4) MISE | (0.1,0.3) MISE | (0.2,0.3) MISE | (0.2,0.2) MISE | (0.3,0.2) MISE | (0.3,0.1) MISE | (0.4,0.1) MISE |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (N, L) | 2.7013 | 3.2803 | 3.2877 | 3.0648 | 3.0751 | 3.1708 | 3.2467 |
| | 4.7107 | 4.3962 | 4.2197 | 4.0074 | 3.9953 | 4.2278 | 4.1772 |
| | 4.2953 | 3.8815 | 3.5265 | 3.4723 | 3.2630 | 3.1153 | 2.8465 |
| (N, N) | 5.4176 | 3.8075 | 4.0953 | 3.8031 | 3.8860 | 4.2107 | 5.1018 |
| | 5.7508 | 4.4523 | 4.3278 | 3.8031 | 4.5206 | 4.6225 | 5.5277 |
| | 5.8240 | 4.1611 | 4.2753 | 3.5777 | 4.0363 | 4.3566 | 4.2559 |
| (L, L) | 6.8579 | 5.6354 | 4.2114 | 3.3682 | 4.3915 | 3.9042 | 4.3463 |
| | 8.2793 | 5.5021 | 4.2403 | 3.3682 | 4.2050 | 4.2479 | 4.7129 |
| | 7.7004 | 7.8699 | 5.8145 | 3.3493 | 4.3965 | 3.2047 | 3.8581 |
| (L, N) | 4.9409 | 4.3785 | 4.3101 | 3.6858 | 3.7947 | 4.5531 | 4.1757 |
| | 5.3184 | 4.8508 | 5.3981 | 4.6511 | 4.3452 | 4.7562 | 4.8375 |
| | 5.0408 | 4.4118 | 4.5309 | 3.9896 | 3.5704 | 3.3006 | 3.3726 |

| Method | (0.1,0.4) MISE | (0.1,0.3) MISE | (0.2,0.3) MISE | (0.2,0.2) MISE | (0.3,0.2) MISE | (0.3,0.1) MISE | (0.4,0.1) MISE |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (N, L) | 0.04895 | 0.04547 | 0.05037 | 0.05615 | 0.07006 | 0.06539 | 0.07410 |
| | 0.07716 | 0.06840 | 0.06457 | 0.06395 | 0.07383 | 0.07897 | 0.07455 |
| | 0.05446 | 0.05133 | 0.05042 | 0.05125 | 0.06842 | 0.05885 | 0.07185 |
| (N, N) | 0.06894 | 0.06061 | 0.08074 | 0.06868 | 0.07698 | 0.07855 | 0.08983 |
| | 0.09306 | 0.07728 | 0.09156 | 0.06868 | 0.08166 | 0.08486 | 0.09174 |
| | 0.07558 | 0.06368 | 0.08162 | 0.06442 | 0.07558 | 0.05729 | 0.08035 |
| (L, L) | 0.05102 | 0.04070 | 0.05352 | 0.05654 | 0.06965 | 0.06364 | 0.07761 |
| | 0.07427 | 0.06039 | 0.06891 | 0.05654 | 0.06962 | 0.07184 | 0.08422 |
| | 0.05678 | 0.05349 | 0.05355 | 0.05094 | 0.06400 | 0.04008 | 0.04855 |
| (L, N) | 0.07343 | 0.05983 | 0.07332 | 0.06923 | 0.07571 | 0.05997 | 0.06183 |
| | 0.09334 | 0.07516 | 0.08357 | 0.07148 | 0.07932 | 0.07314 | 0.08148 |
| | 0.07820 | 0.06183 | 0.07485 | 0.05864 | 0.06491 | 0.04676 | 0.05368 |

smooth enough (relative to

This work was supported by the Natural Science Foundation of Jiangxi Province of China under grant number 20142BAB211018.

Yin, Z.H., Liu, F. and Xie, Y.F. (2016) Nonparametric Regression Estimation with Mixed Measurement Errors. Applied Mathematics, 7, 2269-2284. http://dx.doi.org/10.4236/am.2016.717179

Let

and

where

Lemma 1 Suppose that

where, here and below, C denotes a generic positive, finite constant.

Proof. It follows from (A2) of Condition A that

The conclusion follows from

The proof for the other result is similar and requires Parseval's Theorem.
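For reference, the version of Parseval's theorem used in such deconvolution arguments is the following (the normalization of the Fourier transform may differ from the paper's):

```latex
\int_{-\infty}^{\infty} f(x)\,\overline{g(x)}\,dx
  \;=\; \frac{1}{2\pi}\int_{-\infty}^{\infty}
        \varphi_f(t)\,\overline{\varphi_g(t)}\,dt,
\qquad
\varphi_f(t) \;=\; \int_{-\infty}^{\infty} e^{\mathrm{i}tx} f(x)\,dx .
```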

From (14) and Lemma 1, we have

The proof of Theorem 2 follows from the expressions of

The proof of Theorem 3 is the same as the proof of Theorem 2, but in this case we need the following lemma.

Lemma 2 Suppose that

with

The proof of Lemma 2 is similar to the proof of Lemma 1 and is omitted.

Proofs of the Results of Section 3.1.2. A standard decomposition gives

is that Lyapounov's condition holds, i.e., for some

Letting

Under the conditions given in Theorem 4, we can prove that

Under the conditions given in Theorem 5, we can prove that

The rest is standard and is omitted.
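For reference, the Lyapounov condition invoked in the proofs of Theorems 4 and 5 has the standard triangular-array form below (the paper's exact summands are not reproduced here; \(\xi_{n1},\dots,\xi_{nn}\) denote the summands and \(s_n^2\) the variance of their sum):

```latex
\exists\,\delta>0:\qquad
\lim_{n\to\infty}\;\frac{1}{s_n^{2+\delta}}
\sum_{i=1}^{n}\mathbb{E}\bigl|\xi_{ni}-\mathbb{E}\xi_{ni}\bigr|^{2+\delta}=0,
\qquad
s_n^{2}=\sum_{i=1}^{n}\operatorname{Var}(\xi_{ni}),
```

under which the central limit theorem for the normalized sum follows.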