Definition
Consider a random sample $X_1, X_2, \dots, X_n$ where $X_i$ are iid. rvs. from a distribution with a probability density function $f(x; \theta)$, $\theta \in \Theta$. The joint pdf of $X_1, \dots, X_n$ is $f(x_1; \theta) f(x_2; \theta) \cdots f(x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta)$.
Definition
The likelihood function is defined by
$$L(\theta) = L(\theta; x_1, \dots, x_n) = \prod_{i=1}^{n} f(x_i; \theta)$$
and can be interpreted as the probability that the observed data occur.
Our aim here is to maximize this likelihood function over $\theta$.
Definition
For given observations $x_1, \dots, x_n$, a value $\hat{\theta}$ at which $L(\theta)$ is a maximum is called a maximum likelihood estimate (MLE) for $\theta$.
That is, $\hat{\theta}$ is the value of $\theta$ that satisfies
$$L(\hat{\theta}; x_1, \dots, x_n) = \max_{\theta \in \Theta} L(\theta; x_1, \dots, x_n).$$
To find such a $\hat{\theta}$ one should:
- First solve $\dfrac{d}{d\theta} L(\theta) = 0$ for $\theta$.
- Then check that it is a maximum by $\dfrac{d^2}{d\theta^2} L(\theta) \Big|_{\theta = \hat{\theta}} < 0$.
In most cases differentiating $L(\theta)$ directly is hard to do. Therefore the log-likelihood $\ell(\theta) = \ln L(\theta)$ is used instead. Since $\ln x$ is strictly increasing when $x > 0$, maximizing $\ell(\theta)$ will also maximize $L(\theta)$.
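As a small sketch of the two-step recipe with a log-likelihood, assume an Exponential$(\lambda)$ model (my choice of example, not from the notes): $f(x; \lambda) = \lambda e^{-\lambda x}$ gives $\ell(\lambda) = n \ln \lambda - \lambda \sum x_i$, and solving $\ell'(\lambda) = 0$ yields the closed form $\hat{\lambda} = n / \sum x_i = 1/\bar{x}$.

```python
import math
import random

def exp_log_likelihood(lam, xs):
    """Log-likelihood of an Exponential(rate=lam) sample:
    l(lam) = n*ln(lam) - lam * sum(x_i)."""
    return len(xs) * math.log(lam) - lam * sum(xs)

def exp_mle(xs):
    """Solving dl/dlam = n/lam - sum(x_i) = 0 gives lam_hat = n / sum(x_i);
    the second derivative -n/lam**2 < 0 confirms it is a maximum."""
    return len(xs) / sum(xs)

random.seed(0)
xs = [random.expovariate(2.0) for _ in range(10_000)]  # true rate = 2
lam_hat = exp_mle(xs)

# The closed-form solution beats nearby candidate values of lambda:
for lam in (lam_hat * 0.9, lam_hat * 1.1):
    assert exp_log_likelihood(lam_hat, xs) > exp_log_likelihood(lam, xs)
print(lam_hat)  # close to the true rate 2.0
```

Working with $\ell$ turns the product over observations into a sum, which is both easier to differentiate and numerically stabler than multiplying many small densities.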
Multiple Parameters
If $\boldsymbol{\theta} = (\theta_1, \dots, \theta_k)$ is a vector of parameters to be estimated, then solve
$$\frac{\partial}{\partial \theta_j} \ell(\theta_1, \dots, \theta_k) = 0, \qquad j = 1, \dots, k,$$
a system of $k$ equations, for the estimates $\hat{\theta}_1, \dots, \hat{\theta}_k$.
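A minimal sketch for the two-parameter case, assuming a Normal$(\mu, \sigma^2)$ model (my choice of example): setting $\partial \ell / \partial \mu = 0$ and $\partial \ell / \partial \sigma^2 = 0$ simultaneously gives $\hat{\mu} = \bar{x}$ and $\hat{\sigma}^2 = \frac{1}{n} \sum (x_i - \bar{x})^2$.

```python
import random

def normal_mle(xs):
    """Jointly solving the two score equations for Normal(mu, sigma^2) gives
    mu_hat = sample mean, sigma2_hat = (1/n) * sum((x - mu_hat)^2).
    Note the 1/n factor: the MLE of sigma^2 is not the unbiased 1/(n-1) version."""
    n = len(xs)
    mu = sum(xs) / n
    sigma2 = sum((x - mu) ** 2 for x in xs) / n
    return mu, sigma2

random.seed(1)
xs = [random.gauss(5.0, 2.0) for _ in range(50_000)]  # mu = 5, sigma = 2
mu_hat, sigma2_hat = normal_mle(xs)
print(mu_hat, sigma2_hat)  # close to 5 and 4
```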
Invariance property
If $\hat{\theta}$ is the MLE for $\theta$ and $g(\theta)$ is a function of $\theta$, then $g(\hat{\theta})$ is the MLE for $g(\theta)$.
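A tiny illustration of invariance, again assuming an Exponential$(\lambda)$ model (my example): the mean of the distribution is $g(\lambda) = 1/\lambda$, so the MLE of the mean is $g(\hat{\lambda}) = 1/\hat{\lambda}$, which is the sample mean.

```python
# Invariance: if lam_hat = n / sum(x) is the MLE of the rate lam,
# then the MLE of the mean g(lam) = 1/lam is g(lam_hat) = 1/lam_hat.
xs = [0.5, 1.2, 0.3, 2.0, 0.8]
lam_hat = len(xs) / sum(xs)
mean_mle = 1.0 / lam_hat                       # g(lam_hat) with g(t) = 1/t
assert abs(mean_mle - sum(xs) / len(xs)) < 1e-12  # equals the sample mean
```

No separate maximization over the transformed parameter is needed; plugging the MLE into $g$ is enough.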
MLE at the boundary of
In such cases the MLE exists but cannot be obtained as a solution to $\frac{d}{d\theta} L(\theta) = 0$.
Example
Take $X_1, \dots, X_n$ iid. $\sim \text{Uniform}(0, \theta)$. What is the MLE of $\theta$?
$$L(\theta) = \prod_{i=1}^{n} \frac{1}{\theta} = \theta^{-n}, \qquad 0 \le x_i \le \theta, \quad i = 1, \dots, n,$$
$$\frac{d}{d\theta} L(\theta) = -n \, \theta^{-n-1} \ne 0,$$
so there is no finite solution for $\theta$.
But observe that $\frac{d}{d\theta} L(\theta) < 0$, implying that $L(\theta)$ is decreasing in $\theta$, so minimizing $\theta$ would maximize the likelihood. But one should also consider that $\theta \ge x_i$ for all $i$. So choosing the minimum value of $\theta$ that covers all the values in the sample would ensure the maximum likelihood. Then,
$$\hat{\theta} = \max(X_1, \dots, X_n) = X_{(n)}.$$
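The boundary argument above can be sketched numerically (sample size and true $\theta$ are my choices): on the feasible set $\theta \ge \max_i x_i$ the likelihood $\theta^{-n}$ is strictly decreasing, so the maximum sits exactly at the boundary point $\max_i x_i$.

```python
import math
import random

def log_likelihood(theta, xs):
    """log L(theta) = -n * ln(theta) if all x_i <= theta, else -infinity
    (theta values below the sample maximum are infeasible)."""
    if max(xs) > theta:
        return -math.inf
    return -len(xs) * math.log(theta)

random.seed(2)
theta_true = 3.0
xs = [random.uniform(0, theta_true) for _ in range(1_000)]

theta_hat = max(xs)  # MLE at the boundary of the feasible set

# Any theta above theta_hat scores strictly worse; anything below is infeasible:
assert log_likelihood(theta_hat, xs) > log_likelihood(theta_hat + 0.01, xs)
assert log_likelihood(theta_hat - 0.01, xs) == -math.inf
print(theta_hat)  # close to (and never above) theta_true = 3.0
```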
Advantages and disadvantages
Advantages
- It is intuitive: it picks the parameter value under which the observed data are most probable.
- It is widely used.
- It can also be used where the observed values are not independent or identically distributed.
- It gives good estimates for large sample sizes.
Disadvantages
- The distribution family $f(x; \theta)$ must be known.
- The MLE might not exist or may not be unique.
- Numerical methods might be needed when no closed-form solution exists.