Intuitively, the first option seems pretty silly, but how should we choose between the other three? Luckily, there are ways to characterize the relative strengths and weaknesses of these approaches. We might prefer an estimator with reasonable performance in small samples, and robustness to a misspecified model is a plus. This brings up one of the first interesting points about selecting an estimator: we may have different priorities depending on the situation, and therefore the best estimator can be context dependent and subject to the judgement of a human.
If we have a large sample, we may not care about small sample properties; what we care about then is consistency. Consistency refers to a specific type of convergence, convergence in probability, which is defined as: an estimator $\hat{\theta}_n$ converges in probability to the target $\theta$ if, for every $\varepsilon > 0$,

$$\lim_{n \to \infty} P\left(\left|\hat{\theta}_n - \theta\right| > \varepsilon\right) = 0.$$
When something converges in probability, the sampling distribution becomes increasingly concentrated around the parameter as the sample size increases. Whoa whoa whoa. So what is the difference between unbiasedness and consistency? They kind of sound the same; how are they different? A quick example demonstrates this really well. Here are two possible estimators you could try:
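The post's own pair isn't reproduced above, so here is a minimal sketch using the classic textbook pair: estimator A uses only the first observation (unbiased but not consistent), while estimator B is the sample mean shifted by $1/n$ (biased but consistent). The distribution, its parameters, and the replication counts below are all illustrative assumptions:

```python
# Classic pair for estimating the mean mu from an iid sample:
#   A: use only the first observation -> unbiased but NOT consistent
#   B: sample mean plus 1/n           -> biased but consistent
import numpy as np

rng = np.random.default_rng(42)
mu = 5.0

for n in [10, 100, 10_000]:
    reps = 2_000
    samples = rng.normal(loc=mu, scale=2.0, size=(reps, n))
    est_a = samples[:, 0]                    # first observation only
    est_b = samples.mean(axis=1) + 1.0 / n   # shifted sample mean
    print(f"n={n:>6}  A: mean={est_a.mean():.3f} sd={est_a.std():.3f}"
          f"  B: mean={est_b.mean():.3f} sd={est_b.std():.3f}")
```

Estimator A stays centered on $\mu$ but its spread never shrinks, no matter how much data you collect; estimator B carries a small bias that vanishes, and its sampling distribution piles up on $\mu$ as $n$ grows, which is consistency in action.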
From this vantage point, it seems that consistency may be more important than unbiasedness if you have a big enough sample (Figure 1). One differentiating feature even among consistent estimators is how quickly they converge in probability.
You may have two estimators, estimator A and estimator B, which are both consistent, but the rate at which they converge may be quite different. We say an estimator is asymptotically normal if, as the sample size goes to infinity, the distribution of the difference between the estimate and the true target parameter value is better and better described by the normal distribution.
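Written out (in standard notation that I'm assuming here, since the post states it only in words), asymptotic normality says the scaled estimation error converges in distribution to a normal:

```latex
% Asymptotic normality of an estimator \hat{\theta}_n for \theta:
% the scaled error converges in distribution to a normal law
% with asymptotic variance \sigma^2.
\[
  \sqrt{n}\,\bigl(\hat{\theta}_n - \theta\bigr)
  \;\xrightarrow{d}\;
  \mathcal{N}\bigl(0,\; \sigma^2\bigr)
\]
```

That $\sigma^2$ is the asymptotic variance, and it is the quantity that will drive the standard errors discussed below.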
In one of its most basic forms, the CLT describes the behavior of a sum (and therefore a mean) of independent and identically distributed (iid) random variables.
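For the iid mean specifically, the classical statement (standard textbook form, not quoted from the post) is just the previous display with the sample mean playing the role of the estimator:

```latex
% CLT for iid X_1, ..., X_n with E[X_i] = mu and Var(X_i) = sigma^2:
\[
  \sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr)
  \;\xrightarrow{d}\;
  \mathcal{N}\bigl(0,\; \sigma^2\bigr)
\]
```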
Remember how you wanted small confidence intervals? Asymptotic normality is the underpinning that allows you to use the standard closed-form formulas for confidence intervals at all. And in fact, asymptotic normality depends not just on the estimator but on the data generating process and the target parameter as well.
Some target parameters have no asymptotically normal estimators. This is true for parametric estimators; the nonparametric crew has other words for this, but the overall idea is the same, so let us not get dragged into that particular fray.
In practice, we are often concerned with relative efficiency: whether one estimator is more efficient than another, i.e., whether it has a smaller variance. The asymptotic normality is what allowed us to construct that symmetric interval; we got the 1.96 from the quantiles of the standard normal distribution. The estimate is the number we got from our estimator. But where does the standard error in that formula come from? We want the standard error to be small because that gives us tighter confidence intervals. We cannot do much about the inherent noise in the data, but what we DO have control over is the choice of estimator, and a good choice here can give a smaller overall standard error, which gives us smaller confidence intervals.
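Concretely, the standard closed-form 95% interval being described here (the standard formula, with 1.96 as the 97.5th percentile of the standard normal) is:

```latex
% Symmetric 95% confidence interval built from asymptotic normality.
% SE shrinks like sigma / sqrt(n), so a more efficient estimator
% (smaller sigma) gives a tighter interval at the same n.
\[
  \hat{\theta} \;\pm\; 1.96 \times \widehat{\mathrm{SE}}\bigl(\hat{\theta}\bigr),
  \qquad
  \widehat{\mathrm{SE}}\bigl(\hat{\theta}\bigr) \approx \frac{\hat{\sigma}}{\sqrt{n}}
\]
```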
Note: this is one of the most important points of this whole blog post. Suppose that we have a sample of data from a normal distribution and we want to estimate the mean of the distribution.
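A minimal simulation of that setup, assuming the natural candidate estimators are the sample mean and the sample median (both are consistent for the center of a normal distribution; the specific numbers below are mine, not the post's):

```python
# Compare the spread of the sample mean vs. the sample median as
# estimators of a normal distribution's mean, across sample sizes.
import numpy as np

rng = np.random.default_rng(0)
reps = 5_000

for n in [10, 50, 250, 1000]:
    samples = rng.normal(loc=0.0, scale=1.0, size=(reps, n))
    sd_mean = samples.mean(axis=1).std()
    sd_median = np.median(samples, axis=1).std()
    print(f"n={n:>5}  sd(mean)={sd_mean:.4f}  "
          f"sd(median)={sd_median:.4f}  ratio={sd_median/sd_mean:.2f}")
```

For normal data the ratio of the two standard deviations settles near $\sqrt{\pi/2} \approx 1.25$, the classic asymptotic relative efficiency of the median versus the mean.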
You can see in Plot 3 that at every sample size, the median is a less efficient estimator than the mean, i.e., its sampling distribution has a larger spread, so it produces wider confidence intervals.

The motivation for using a regression model to analyze an otherwise tidy, randomized experiment is variance reduction. If our outcome has many drivers, each one of those drivers is adding to the variation in the outcome.
When we include some of those drivers as covariates, they help absorb a portion of the overall variation in the outcome, which can make it easier to see the impact of the treatment. Suppose, for example, a simulated randomized trial where the outcome depends on the treatment and on two known pre-treatment covariates. We could choose whether to analyze the data with a difference-in-sample-means approach or with a regression model that includes those two known pre-treatment covariates.
Figure 4 shows the estimates and confidence intervals from such simulated trials. Both methods produce valid confidence intervals centered around the true underlying effect, but the confidence intervals for this particular simulation were more than 6x wider for the sample mean approach compared with the regression approach. In this toy example, we had an unrealistically easy data generating process to model: there was nothing complicated, non-linear, or interacting in the data generating process, so the most obvious specification of the regression model was correct.
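Here is a sketch of that kind of simulation; the data generating process, covariate coefficients, sample size, and effect size are all made-up assumptions rather than the post's actual Figure 4 setup:

```python
# Randomized experiment: outcome driven by treatment plus two known
# pre-treatment covariates. Compare the difference-in-means estimate's
# spread to an OLS estimate that adjusts for the covariates.
import numpy as np

rng = np.random.default_rng(1)
reps, n, true_effect = 2_000, 500, 1.0
dim_estimates, ols_estimates = [], []

for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)
    t = rng.integers(0, 2, size=n)            # random treatment assignment
    y = true_effect * t + 3.0 * x1 + 3.0 * x2 + rng.normal(size=n)

    # Difference in sample means between treated and control units
    dim_estimates.append(y[t == 1].mean() - y[t == 0].mean())

    # OLS with intercept, treatment indicator, and the two covariates
    X = np.column_stack([np.ones(n), t, x1, x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ols_estimates.append(beta[1])             # coefficient on treatment

print(f"diff-in-means: mean={np.mean(dim_estimates):.3f}, "
      f"sd={np.std(dim_estimates):.3f}")
print(f"regression:    mean={np.mean(ols_estimates):.3f}, "
      f"sd={np.std(ols_estimates):.3f}")
```

Both estimators come out centered on the true effect, but the regression estimate's standard deviation is several times smaller; the exact ratio depends on how much of the outcome variance the covariates explain.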