0:00 [MUSIC PLAYING]
0:06 Welcome back.
0:07 We've covered a lot of ground already,
0:09 so today I want to review and reinforce concepts.
0:12 To do that, we'll explore two things.
0:14 First, we'll code up a basic pipeline
0:16 for supervised learning.
0:17 I'll show you how multiple classifiers
0:19 can solve the same problem.
0:21 Next, we'll build up a little more intuition
0:23 for what it means for an algorithm to learn something
0:25 from data, because that sounds kind of magical, but it's not.
0:29 To kick things off, let's look at a common experiment
0:31 you might want to do.
0:33 Imagine you're building a spam classifier.
0:35 That's just a function that labels an incoming email
0:37 as spam or not spam.
0:39 Now, say you've already collected a data set
0:41 and you're ready to train a model.
0:42 But before you put it into production,
0:44 there's a question you need to answer first--
0:46 how accurate will it be when you use it to classify emails that
0:49 weren't in your training data?
0:51 As best we can, we want to verify our models work well
0:54 before we deploy them.
0:56 And we can do an experiment to help us figure that out.
0:59 One approach is to partition our data set into two parts.
1:02 We'll call these Train and Test.
1:05 We'll use Train to train our model
1:07 and Test to see how accurate it is on new data.
1:10 That's a common pattern, so let's see how it looks in code.
1:13 To kick things off, let's import a data set into scikit-learn.
1:17 We'll use Iris again, because it's handily included.
1:20 Now, we already saw Iris in episode two.
1:21 But what we haven't seen before is
1:23 that I'm calling the features x and the labels y.
1:26 Why is that?
1:28 Well, that's because one way to think of a classifier
1:30 is as a function.
1:32 At a high level, you can think of x as the input
1:34 and y as the output.
1:36 I'll talk more about that in the second half of this episode.
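As a sketch of that import step (the Iris data set really does ship with scikit-learn; the variable names here follow the x-input, y-output convention just described):

```python
# Load the Iris data set that ships with scikit-learn.
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data    # features: four measurements per flower
y = iris.target  # labels: the type of Iris, encoded as 0, 1, or 2
```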
1:39 After we import the data set, the first thing we want to do
1:42 is partition it into Train and Test.
1:44 And to do that, we can import a handy utility,
1:46 and it makes the syntax clear.
1:48 We're taking our x's and our y's,
1:50 or our features and labels, and partitioning them
1:52 into two sets.
1:54 X_train and y_train are the features and labels
1:56 for the training set.
1:57 And X_test and y_test are the features and labels
2:00 for the testing set.
2:02 Here, I'm just saying that I want half the data to be
2:04 used for testing.
2:05 So if we have 150 examples in Iris, 75 will be in Train
2:09 and 75 will be in Test.
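Here's roughly what that partitioning looks like. In current scikit-learn versions the utility lives in sklearn.model_selection, and test_size=0.5 asks for the 50/50 split described above:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold out half the data for testing; the split itself is randomized.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)
```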
2:11 Now we'll create our classifier.
2:13 I'll use two different types here
2:14 to show you how they accomplish the same task.
2:17 Let's start with the decision tree we've already seen.
2:20 Note there are only two lines of code
2:22 that are classifier-specific.
2:25 Now let's train the classifier using our training data.
2:28 At this point, it's ready to be used to classify data.
2:31 And next, we'll call the predict method
2:33 and use it to classify our testing data.
2:35 If you print out the predictions,
2:37 you'll see they're a list of numbers.
2:38 These correspond to the type of Iris
2:40 the classifier predicts for each row in the testing data.
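A minimal sketch of those steps, end to end (variable names are mine; create the classifier, fit it on the training data, then predict on the testing data):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)

# The two classifier-specific lines: create the classifier...
my_classifier = DecisionTreeClassifier()
# ...and train it using our training data.
my_classifier.fit(X_train, y_train)

# Classify the held-out testing data.
predictions = my_classifier.predict(X_test)
print(predictions)  # one predicted Iris type (0, 1, or 2) per testing row
```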
2:44 Now let's see how accurate our classifier
2:46 was on the testing set.
2:48 Recall that up top, we have the true labels for the testing
2:50 data.
2:51 To calculate our accuracy, we can
2:53 compare the predicted labels to the true labels,
2:55 and tally up the score.
2:57 There's a convenience method in scikit-learn
2:59 we can import to do that.
3:00 Notice here, our accuracy was over 90%.
3:03 If you try this on your own, it might be a little bit different
3:06 because of some randomness in how the Train/Test
3:08 data is partitioned.
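Putting the accuracy check together (the convenience method is accuracy_score from sklearn.metrics; as noted above, the exact number varies with the random split):

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)

my_classifier = DecisionTreeClassifier()
my_classifier.fit(X_train, y_train)
predictions = my_classifier.predict(X_test)

# Compare the predicted labels to the true labels and tally up the score.
print(accuracy_score(y_test, predictions))  # typically above 0.9 on Iris
```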
3:10 Now, here's something interesting.
3:11 By replacing these two lines, we can use a different classifier
3:14 to accomplish the same task.
3:16 Instead of using a decision tree,
3:18 we'll use one called KNeighborsClassifier.
3:20 If we run our experiment, we'll see that the code
3:23 works in exactly the same way.
3:25 The accuracy may be different when you run it,
3:27 because this classifier works a little bit differently
3:29 and because of the randomness in the Train/Test split.
3:32 Likewise, if we wanted to use a more sophisticated classifier,
3:35 we could just import it and change these two lines.
3:38 Otherwise, our code is the same.
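To make the swap concrete, here's the same pipeline with only the two classifier-specific lines changed (a sketch, assuming the default KNeighborsClassifier settings):

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)

# Only these two lines change; the rest of the pipeline is identical.
my_classifier = KNeighborsClassifier()
my_classifier.fit(X_train, y_train)

predictions = my_classifier.predict(X_test)
print(accuracy_score(y_test, predictions))
```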
3:40 The takeaway here is that while there are many different types
3:42 of classifiers, at a high level, they have a similar interface.
3:49 Now let's talk a little bit more about what
3:50 it means to learn from data.
3:53 Earlier, I said we called the features x and the labels y,
3:56 because they were the input and output of a function.
3:58 Now, of course, a function is something we already
4:00 know from programming.
4:02 def classify-- there's our function.
4:04 As we already know, in supervised learning
4:06 we don't want to write this ourselves.
4:09 We want an algorithm to learn it from training data.
4:12 So what does it mean to learn a function?
4:15 Well, a function is just a mapping from input
4:17 to output values.
4:18 Here's a function you might have seen before-- y
4:20 equals mx plus b.
4:22 That's the equation for a line, and there
4:24 are two parameters-- m, which gives the slope;
4:27 and b, which gives the y-intercept.
4:29 Given these parameters, of course,
4:31 we can plot the function for different values of x.
4:34 Now, in supervised learning, our classify function
4:36 might have some parameters as well,
4:38 but the input x are the features for an example we
4:41 want to classify, and the output y
4:43 is a label, like Spam or Not Spam, or a type of flower.
4:47 So what could the body of the function look like?
4:49 Well, that's the part we want to write algorithmically
4:51 or in other words, learn.
4:53 The important thing to understand here
4:55 is we're not starting from scratch
4:57 and pulling the body of the function out of thin air.
5:00 Instead, we start with a model.
5:01 And you can think of a model as the prototype for--
5:04 or the rules that define-- the body of our function.
5:07 Typically, a model has parameters
5:08 that we can adjust with our training data.
5:10 And here's a high-level example of how this process works.
5:14 Let's look at a toy data set and think about what kind of model
5:17 we could use as a classifier.
5:19 Pretend we're interested in distinguishing
5:20 between red dots and green dots, some of which
5:23 I've drawn here on a graph.
5:25 To do that, we'll use just two features--
5:27 the x- and y-coordinates of a dot.
5:29 Now let's think about how we could classify this data.
5:32 We want a function that considers
5:34 a new dot it's never seen before,
5:35 and classifies it as red or green.
5:38 In fact, there might be a lot of data we want to classify.
5:40 Here, I've drawn our testing examples
5:42 in light green and light red.
5:44 These are dots that weren't in our training data.
5:47 The classifier has never seen them before, so how can
5:49 it predict the right label?
5:51 Well, imagine if we could somehow draw a line
5:53 across the data like this.
5:56 Then we could say the dots to the left
5:57 of the line are green and dots to the right of the line are
6:00 red.
6:00 And this line can serve as our classifier.
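As a toy sketch of that line-as-classifier idea (the names and the green/red convention are mine, and I'm using above/below the line y = mx + b rather than left/right, but the idea is the same):

```python
def classify(point, m, b):
    """Label an (x, y) point by which side of the line y = m*x + b it falls on."""
    x, y = point
    # Dots above the line are green; dots on or below it are red.
    return "green" if y > m * x + b else "red"

print(classify((1, 5), m=1, b=0))  # green: 5 is above the line y = x
```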
6:03 So how can we learn this line?
6:05 Well, one way is to use the training data to adjust
6:08 the parameters of a model.
6:09 And let's say the model we use is a simple straight line
6:12 like we saw before.
6:14 That means we have two parameters to adjust-- m and b.
6:17 And by changing them, we can change where the line appears.
6:21 So how could we learn the right parameters?
6:23 Well, one idea is that we can iteratively adjust
6:25 them using our training data.
6:27 For example, we might start with a random line
6:29 and use it to classify the first training example.
6:32 If it gets it right, we don't need to change our line,
6:35 so we move on to the next one.
6:36 But on the other hand, if it gets it wrong,
6:38 we could slightly adjust the parameters of our model
6:41 to make it more accurate.
6:43 The takeaway here is this.
6:44 One way to think of learning is using training data
6:47 to adjust the parameters of a model.
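Here's a toy sketch of that loop, in the spirit of a perceptron-style update. Everything in it-- the data, the learning rate, starting from a flat line instead of a random one-- is a made-up illustration of the idea, not the video's code:

```python
def predict(point, m, b):
    """1 means green (above the line y = m*x + b); 0 means red."""
    x, y = point
    return 1 if y > m * x + b else 0

def train(points, labels, m=0.0, b=0.0, lr=0.1, epochs=100):
    """Nudge the line's parameters whenever it misclassifies an example."""
    for _ in range(epochs):
        for (x, y), label in zip(points, labels):
            error = label - predict((x, y), m, b)
            if error != 0:
                # Got it wrong: slightly adjust the line toward this point.
                m -= error * lr * x
                b -= error * lr
            # Got it right: leave the line alone and move on.
    return m, b

# Greens lie above the line y = 2x; reds lie below it.
points = [(1, 3), (2, 5), (0, 1), (1, 1), (2, 3), (0, -1)]
labels = [1, 1, 1, 0, 0, 0]
m, b = train(points, labels)
```

After a few passes over this separable toy data, the learned line classifies every training example correctly.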
6:50 Now, here's something really special.
6:52 It's called tensorflow/playground.
6:55 This is a beautiful example of a neural network
6:57 you can run and experiment with right in your browser.
7:00 Now, this deserves its own episode for sure,
7:02 but for now, go ahead and play with it.
7:03 It's awesome.
7:04 The playground comes with different data
7:06 sets you can try out.
7:08 Some are very simple.
7:09 For example, we could use our line to classify this one.
7:12 Some data sets are much more complex.
7:15 This data set is especially hard--
7:17 see if you can build a network to classify it.
7:20 Now, you can think of a neural network
7:21 as a more sophisticated type of classifier,
7:24 like a decision tree or a simple line.
7:26 But in principle, the idea is similar.
7:29 OK.
7:29 Hope that was helpful.
7:30 I just created a Twitter account that you can follow
7:32 to be notified of new episodes.
7:33 And the next one should be out in a couple of weeks,
7:36 depending on how much work I'm doing for Google I/O. Thanks,
7:38 as always, for watching, and I'll see you next time.
Transcript: YouTube