YouTube: What Makes a Good Feature? - Machine Learning Recipes #3

Transcript

0:06JOSH GORDON: Classifiers are only

0:08as good as the features you provide.

0:10That means coming up with good features

0:12is one of your most important jobs in machine learning.

0:14But what makes a good feature, and how can you tell?

0:17If you're doing binary classification,

0:19then a good feature makes it easy to decide

0:21between two different things.

0:23For example, imagine we wanted to write a classifier

0:26to tell the difference between two types of dogs--

0:29greyhounds and Labradors.

0:30Here we'll use two features-- the dog's height in inches

0:34and their eye color.

0:35Just for this toy example, let's make a couple assumptions

0:38about dogs to keep things simple.

0:40First, we'll say that greyhounds are usually

0:43taller than Labradors.

0:44Next, we'll pretend that dogs have only two eye

0:47colors-- blue and brown.

0:48And we'll say the color of their eyes

0:50doesn't depend on the breed of dog.

0:53This means that one of these features is useful

0:55and the other tells us nothing.

0:57To understand why, we'll visualize them using a toy

1:01dataset I'll create.

1:02Let's begin with height.

1:04How useful do you think this feature is?

1:06Well, on average, greyhounds tend

1:08to be a couple inches taller than Labradors, but not always.

1:11There's a lot of variation in the world.

1:13So when we think of a feature, we

1:15have to consider how it looks for different values

1:17in a population.

1:19Let's head into Python for a programmatic example.

1:22I'm creating a population of 1,000

1:24dogs-- 50-50 greyhound Labrador.

1:27I'll give each of them a height.

1:29For this example, we'll say that greyhounds

1:31are on average 28 inches tall and Labradors are 24.

1:35Now, all dogs are a bit different.

1:37Let's say that height is normally distributed,

1:39so we'll make both of these plus or minus 4 inches.

1:42This will give us two arrays of numbers,

1:44and we can visualize them in a histogram.

1:47I'll add a parameter so greyhounds are in red

1:49and Labradors are in blue.

1:51Now we can run our script.

1:53This shows how many dogs in our population have a given height.
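
A minimal sketch of the script described above, using only the Python standard library. The plotting library used in the video is not named in the transcript, so this version prints a text histogram instead of drawing red and blue bars. It assumes "plus or minus 4 inches" means a standard deviation of 4; the seed, the 2-inch bin width, and the names `grey_height`, `lab_height`, and `bin_counts` are illustrative choices.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the sketch is repeatable (an assumption)

N_PER_BREED = 500  # 1,000 dogs in total, split 50-50

# Heights in inches: mean 28 for greyhounds, mean 24 for Labradors,
# both normally distributed with a standard deviation of 4.
grey_height = [random.gauss(28, 4) for _ in range(N_PER_BREED)]
lab_height = [random.gauss(24, 4) for _ in range(N_PER_BREED)]


def bin_counts(heights, width=2):
    """Count how many heights fall into each bin of `width` inches."""
    return Counter(int(h // width) * width for h in heights)


grey_bins = bin_counts(grey_height)
lab_bins = bin_counts(lab_height)

# Text histogram: one row per 2-inch bin. Each letter stands for 5 dogs
# (G = greyhound, L = Labrador); exact counts follow in parentheses.
for start in sorted(set(grey_bins) | set(lab_bins)):
    g, l = grey_bins[start], lab_bins[start]
    bar = "G" * (g // 5) + "L" * (l // 5)
    print(f"{start:>3}-{start + 2:<3} {bar}  ({g} G, {l} L)")
```

Rows near the bottom of the height range should be mostly L and rows near the top mostly G, with a mixed region in between.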

1:57There's a lot of data on the screen,

1:58so let's simplify it and look at it piece by piece.

2:03We'll start with dogs on the far left

2:05of the distribution-- say, who are about 20 inches tall.

2:08Imagine I asked you to predict whether a dog with this height

2:11was a lab or a greyhound.

2:13What would you do?

2:14Well, you could figure out the probability of each type

2:16of dog given their height.

2:18Here, it's more likely the dog is a lab.

2:20On the other hand, if we go all the way

2:22to the right of the histogram and look

2:24at a dog who is 35 inches tall, we

2:26can be pretty confident they're a greyhound.

2:29Now, what about a dog in the middle?

2:31You can see the graph gives us less information

2:33here, because the probability of each type of dog is close.

2:36So height is a useful feature, but it's not perfect.
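
The reasoning about 20-inch, middle, and 35-inch dogs can be sketched as a probability calculation. This is an illustration rather than the code from the video: it assumes the two breeds are equally common and that heights follow the normal distributions described earlier (means 28 and 24, standard deviation 4). The function name `p_greyhound` is hypothetical.

```python
from statistics import NormalDist

# Assumed height distributions, in inches.
greyhound = NormalDist(mu=28, sigma=4)
labrador = NormalDist(mu=24, sigma=4)


def p_greyhound(height):
    """P(greyhound | height), assuming equal numbers of each breed."""
    g = greyhound.pdf(height)
    l = labrador.pdf(height)
    return g / (g + l)


for h in (20, 26, 35):
    p = p_greyhound(h)
    print(f"{h} in: P(greyhound) = {p:.2f}, P(lab) = {1 - p:.2f}")
```

Under these assumptions a 20-inch dog is more likely a lab, a 35-inch dog is very likely a greyhound, and a 26-inch dog, halfway between the two means, is an even split.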

2:40That's why in machine learning, you almost always

2:42need multiple features.

2:43Otherwise, you could just write an if statement

2:45instead of bothering with the classifier.
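
To see how far a single if statement gets on height alone, here is a small sketch on simulated dogs. The 26-inch threshold (the midpoint of the two assumed means), the seed, and the sample size are illustrative assumptions, not values from the video.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable (an assumption)


def predict(height, threshold=26):
    """A one-feature 'classifier': just an if statement on height."""
    if height > threshold:
        return "greyhound"
    return "labrador"


# Simulated population: heights in inches with the assumed distributions.
dogs = [(random.gauss(28, 4), "greyhound") for _ in range(5000)]
dogs += [(random.gauss(24, 4), "labrador") for _ in range(5000)]

correct = sum(predict(height) == breed for height, breed in dogs)
accuracy = correct / len(dogs)
print(f"accuracy with height alone: {accuracy:.2f}")
```

The rule is right well more often than a coin flip but still wrong for a large share of dogs, which is why more features are needed.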

2:47To figure out what types of features you should use,

2:50do a thought experiment.

2:52Pretend you're the classifier.

2:53If you were trying to figure out if this dog is

2:55a lab or a greyhound, what other things would you want to know?

3:00You might ask about their hair length,

3:01or how fast they can run, or how much they weigh.

3:04Exactly how many features you should use

3:06is more of an art than a science,

3:08but as a rule of thumb, think about how many you'd

3:10need to solve the problem.

3:12Now let's look at another feature like eye color.

3:15Just for this toy example, let's imagine

3:17dogs have only two eye colors, blue and brown.

3:20And let's say the color of their eyes

3:22doesn't depend on the breed of dog.

3:24Here's what a histogram might look like for this example.

3:28For most values, the distribution is about 50/50.

3:32So this feature tells us nothing,

3:33because it doesn't correlate with the type of dog.

3:36Including a useless feature like this in your training

3:39data can hurt your classifier's accuracy.

3:41That's because there's a chance they might appear useful purely

3:45by accident, especially if you have only a small amount

3:48of training data.
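
A short simulation can illustrate both points: eye color carries no information about breed, yet it can look informative in a small sample. This is a sketch under the toy assumptions above (two eye colors, assigned independently of breed). The sample sizes, the seed, and the name `blue_fraction_by_breed` are hypothetical.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable (an assumption)


def blue_fraction_by_breed(n_per_breed):
    """Assign eye color by coin flip, ignoring breed, and
    return the share of blue-eyed dogs for each breed."""
    fractions = {}
    for breed in ("greyhound", "labrador"):
        eyes = [random.choice(("blue", "brown")) for _ in range(n_per_breed)]
        fractions[breed] = eyes.count("blue") / n_per_breed
    return fractions


large = blue_fraction_by_breed(50_000)
small = blue_fraction_by_breed(10)
print("large sample:", large)
print("small sample:", small)
```

With many dogs, both breeds sit very close to 50% blue. With only 10 dogs per breed, the two shares can differ noticeably by chance, and a classifier trained on so little data could mistake that gap for a real pattern.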

3:50You also want your features to be independent.

3:52And independent features give you

3:54different types of information.

3:56Imagine we already have a feature-- height in inches--

3:59in our dataset.

4:00Ask yourself, would it be helpful

4:02if we added another feature, like height in centimeters?

4:05No, because it's perfectly correlated with one

4:08we already have.

4:09It's good practice to remove highly correlated features

4:12from your training data.

4:14That's because a lot of classifiers

4:15aren't smart enough to realize that height in inches

4:18and in centimeters are the same thing,

4:20so they might double count how important this feature is.
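
One way to spot a redundant feature like this is to measure the correlation between columns before training. The sketch below computes the Pearson correlation by hand on simulated heights. The helper name `pearson`, the seed, and the simulated data are illustrative assumptions.

```python
import math
import random

random.seed(3)  # fixed seed so the sketch is repeatable (an assumption)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


height_in = [random.gauss(26, 4) for _ in range(1000)]
height_cm = [h * 2.54 for h in height_in]  # same measurement, new unit

r = pearson(height_in, height_cm)
print(f"correlation between inches and centimeters: {r:.6f}")
```

A correlation of essentially 1 means the second column adds no new information, so one of the two can be dropped.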

4:23Last, you want your features to be easy to understand.

4:26For a new example, imagine you want

4:28to predict how many days it will take

4:30to mail a letter between two different cities.

4:33The farther apart the cities are, the longer it will take.

4:37A great feature to use would be the distance

4:39between the cities in miles.

4:42A much worse pair of features to use

4:44would be the cities' locations given by their latitude

4:47and longitude.

4:48And here's why.

4:48I can look at the distance and make

4:51a good guess of how long it will take the letter to arrive.

4:54But learning the relationship between latitude, longitude,

4:56and time is much harder and would require many more

5:00examples in your training data.
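
If only coordinates are available, the easier feature can be derived from them before training. The sketch below uses the haversine formula for great-circle distance, a technique not mentioned in the video. The function name `distance_miles`, the Earth radius value, and the approximate city coordinates are assumptions for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean radius of the Earth


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula).

    Inputs are latitude and longitude in decimal degrees.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Approximate coordinates, used only as an example.
new_york = (40.7128, -74.0060)
los_angeles = (34.0522, -118.2437)

d = distance_miles(*new_york, *los_angeles)
print(f"distance feature: {d:.0f} miles")
```

The four raw coordinates collapse into one number that relates directly to delivery time, so a model needs far fewer examples to learn from it.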

5:01Now, there are techniques you can

5:03use to figure out exactly how useful your features are,

5:05and even what combinations of them are best,

5:08so you never have to leave it to chance.

5:11We'll get to those in a future episode.

5:13Coming up next time, we'll continue building our intuition

5:16for supervised learning.

5:17We'll show how different types of classifiers

5:19can be used to solve the same problem and dive a little bit

5:22deeper into how they work.

5:24Thanks very much for watching, and I'll see you then.
