How can we detect overfitting?
Geoff Gordon—10-601 Machine Learning—Fall 2015

•Hold-out set (aka validation set)
[figure: scatter of ○ and ⨉ training points, with one group set aside as the hold-out set]
remove hold-out group, fit on rest
add back in hold-out group, compute error
estimated error rate: 1/3
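The hold-out procedure above can be sketched in plain Python. This is a minimal illustration, not code from the lecture; the `fit`/`predict` callables and the function name are hypothetical placeholders for whatever classifier is being evaluated.

```python
import random

def holdout_error(points, labels, fit, predict, frac=1/3, seed=0):
    """Estimate test error with a hold-out (validation) set.

    fit(points, labels) -> model; predict(model, point) -> label.
    Both are hypothetical interfaces standing in for any classifier.
    """
    rng = random.Random(seed)
    idx = list(range(len(points)))
    rng.shuffle(idx)
    n_hold = max(1, int(len(points) * frac))
    hold, rest = idx[:n_hold], idx[n_hold:]
    # remove hold-out group, fit on the rest
    model = fit([points[i] for i in rest], [labels[i] for i in rest])
    # add back in the hold-out group, compute error on it
    wrong = sum(predict(model, points[i]) != labels[i] for i in hold)
    return wrong / n_hold
```

For instance, with a trivial majority-class "classifier" (`fit = lambda X, y: max(set(y), key=y.count)`, `predict = lambda m, x: m`), the estimated error is just the hold-out fraction of minority-class points.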
•Cross-validation
[figure: scatter of ○ and ⨉ training points, split evenly into groups ("folds")]
split data evenly into groups ("folds")
remove green group, fit on rest; add back green group: error 1/4
remove red group, fit on rest; add back red group: error 2/4
remove blue group, fit on rest; add back blue group: error 2/4
Overall: (1+2+2)/12 = 42% error rate
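The fold-by-fold loop above can be sketched as k-fold cross-validation in plain Python. Again a minimal illustration with hypothetical `fit`/`predict` interfaces, not the lecture's code; pooling the per-fold error counts reproduces the "(1+2+2)/12" style of overall estimate.

```python
import random

def kfold_error(points, labels, fit, predict, k=3, seed=0):
    """k-fold cross-validation: split the data evenly into k folds,
    hold out each fold in turn, and pool the errors over all points."""
    rng = random.Random(seed)
    idx = list(range(len(points)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal groups
    wrong = 0
    for fold in folds:
        held = set(fold)
        rest = [i for i in idx if i not in held]
        # remove this fold, fit on the rest
        model = fit([points[i] for i in rest], [labels[i] for i in rest])
        # add the fold back in, count its errors
        wrong += sum(predict(model, points[i]) != labels[i] for i in fold)
    return wrong / len(points)  # overall error rate
```

Unlike a single hold-out set, every point is used once for evaluation and k-1 times for fitting, which typically gives a lower-variance error estimate from the same data.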
•Bootstrap
[figure: bootstrap resample of the training points; repeated draws land really on top of each other]
make a bootstrap resample of our data: size = N, each example drawn independently w/ replacement from the original training set
fit our classifier on the new sample (often called a "bag")
evaluate on out-of-bag (oob) samples
Repeat F times; final error estimate = average error on oob samples
Can treat fitted parameter vectors as a sample from the posterior distribution over parameters (given data)
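The bootstrap/out-of-bag procedure can be sketched the same way. This is a minimal illustration under the same hypothetical `fit`/`predict` interface; `F` rounds of size-N resampling with replacement, evaluating each fitted model only on the points its bag missed.

```python
import random

def bootstrap_oob_error(points, labels, fit, predict, F=10, seed=0):
    """Bootstrap error estimate: resample N points with replacement,
    fit on the bag, evaluate on the out-of-bag (oob) points;
    final estimate = average oob error over F rounds."""
    rng = random.Random(seed)
    n = len(points)
    errors = []
    for _ in range(F):
        # size = N, each example drawn independently with replacement
        bag = [rng.randrange(n) for _ in range(n)]
        in_bag = set(bag)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:  # rare: every point was drawn at least once
            continue
        model = fit([points[i] for i in bag], [labels[i] for i in bag])
        wrong = sum(predict(model, points[i]) != labels[i] for i in oob)
        errors.append(wrong / len(oob))
    return sum(errors) / len(errors)
```

On average about 1/e ≈ 37% of the original points fall out of each bag, so every round gets a fresh held-out set without ever shrinking the fitting sample below size N.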