Thursday, 10 November 2011

Dropping Stanford’s Online AI Class

I have decided to stop following Stanford’s online AI course to focus on the sibling Machine Learning and Database classes. Simply put, I feel that I’m learning a great deal more from the ML and DB classes, and the AI class isn’t really giving me a sufficient return on investment for my time. I’d like to jot down a few thoughts as to why this is the case.

Now, it may seem churlish to criticise a course being put out for free, so let me first clarify that I do heartily appreciate the generous efforts of Sebastian Thrun and Peter Norvig in running this; it must take an inordinate amount of time and effort to do so. I’m also convinced that online courses like this are a tremendously important future direction for education. Even so, I believe that it’s appropriate to evaluate and give constructive criticism -- much like reviewing a piece of open source software built by volunteers, for example.

I think the biggest weakness of the AI course is the lack of any practical assignments. Both the ML and DB classes have hands-on assignments for the week’s topics, which really helps to understand and motivate the material. The AI class lacks any equivalent; indeed, for some of the weeks, only half the topics even appeared in the homework problems. Clearly, if I were sufficiently motivated, I could go and implement these things for myself, but part of the purpose of a course is to provide external motivation. After a hard day at work, it’s often only the threat of a deadline that will get me to study ;-)

The homeworks that have been set have been, in my opinion, rather low quality, with lots of ambiguity or errors requiring clarification after the fact...or even giving the answer to one question in the next! While I’ve scored well on them, I don’t necessarily feel it corresponds to any deep understanding of the material. It’s quite possible to answer many homework questions just by mimicking the mechanics of a lecture video by rote without any deep comprehension of what’s going on.

The lectures themselves are divided up into lots of short videos with quizzes at the end. Unfortunately, I found the constant quizzes to be simply annoying. I understand that asking questions to provoke the listener to think through a topic is a good pedagogic technique, but it can definitely be overused. Rather than prompting me to think through a question, my response often ended up being “how the %£$! should I know?” Further, because of the embedded quizzes, the videos aren’t really set up for offline viewing. Most of my study time is on the train commuting, so this was a big downside. For the ML and DB classes, the videos are divided into a few chunks with download links, which makes them practical for offline use.

Some of the pacing of material could be better -- the unit on probability that rapidly went from easy examples of a biased coin to some involved Bayesian probability calculations seemed particularly ambitious. There have also been a few technical issues caused by the large number of people using the site -- this one is entirely forgivable, given the size of the class, but the ML and DB class infrastructure has been very stable by comparison.

Again, to reiterate, I strongly support the concept behind these courses and appreciate the hard work that has gone into them. I’m also aware that there are plenty of people who are really enjoying the class. If I had more time available, I probably would stick with it, but it’s not really worth it for me at this point.

17 comments:

  1. I agree; I have also stopped AI. I was losing interest and only keeping up for ego's sake.

  2. I have gotten a lot of value out of doing the programming assignments from the sibling on-campus Stanford CS221 class, available here:

    http://www.stanford.edu/class/cs221/schedule.html
    ...and the lack of direct machine-grading hasn't been a big problem. The assignments contain a lot of hints for telling whether your solution is operating properly or not.

  3. The ambiguity in the two professors' wordings is also biting me. Very often, I can understand what an online quiz is asking only *after* watching the solution.

    Yet I must say that the ML class has its share of problems too. In particular, the video lectures are just full of misconceptions. For instance, in the lecture on backpropagation, Prof. Ng says that the small letter delta represents an error term, but this is patently wrong: delta is a sensitivity term, or a differential quotient, but not an error term in general. It happens to be an error term in the lecture because Ng is using some particular error function. For other error functions (e.g. the mean squared error), delta is not equal to the training error a-y.

    There are many other misconceptions as well, which I won't elaborate here. In short, I find the AI class inspiring but a bit painful to follow, while the ML class is very well presented but full of misconceptions.

  4. There's a concept in psychology called "Learned Helplessness" (http://en.wikipedia.org/wiki/Learned_helplessness). It's where the subject has no control over the outcome, and simply gives up. A guinea pig will just sit in a corner stressing out ("when is the next shock"), or ignore the reward ("meh - I got fed again").

    I spent some 30 hours last week on the homework (I'm not exaggerating). I've come to realize that most of that time was spent stressing over the interpretation of the questions, not the meaning of the answers.

    For example, I knew perfectly well that the equation was either incorrect or invalid - but is the paren missing on purpose?

    Most of my time was spent scouring the blogs and chatrooms looking for hints from other students. The clarification did nothing to reduce the ambiguity.

    Similarly, I knew which axioms were incorrect and which were correct and which were correct+complete. Now, what does correct *for* the situation mean?

    I got burned out this weekend. Now I'm struggling with "learned helplessness":

    1) I started the class with the intent to get 100% on all the quizzes. I tried really *really* hard to understand the question and come up with the right answer.

    Two weeks in and I wasn't as careful with the answers. Five weeks and I don't care at all. Bad answer? Oh - let's see what the explanation is.

    2) I started the class with the intent to get 100% on all the homework. I've got the time, I've got the motivation, and I've got the skills and maths background needed. At the Stanford level.

    Four weeks in and I got tired out on homework 5. I'm having trouble bringing myself to even view the lectures, they seem flat, uninteresting, even boring. I try cracking the book and my eyes glaze over.

    I'm looking at some serious motivational repair just to keep going. True, I've learned things, but I could learn them more easily by reading the book and researching on the net if I need more. And with none of the frustration.

    The sad thing about all this is the effect that the class is having on the *world*. AI is a rich and interesting field, and there's lots of fascinating unsolved problems waiting for the inspired amateur.

    The course is turning off large swaths of the population who might otherwise be interested in the field. Everyone is forming an association between AI and frustration. The course is doing the world a disservice.

    Here are some common counter arguments with subtexts.

    1) The course is free, so what? (Means: Your argument is invalid because I don't value the course as much as you)

    2) If you don't have the skills or maths for a Stanford course, maybe you should quit. (Means: Your argument is invalid because difficulty is the same as ambiguity)

    3) I didn't think the questions were ambiguous (Means: Your argument is invalid because some small portion of the students don't think there's a problem)

    4) You're just whining because you didn't get a perfect score (Means: Your argument is invalid because I'm a dick)

  5. ML rocks, AI is nice but the ambiguity is disappointing, and that "questions first, explanation later" style doesn't work wonders for me.

    @The suffocated: Actually, regardless of the error estimator used (squared, log, etc.), in backprop delta IS interpreted as an error (especially for the final layer; for the hidden layer it could be interpreted as the actual contribution of each hidden neuron to the total error). Backprop unfolds just the same whether using log-error or squared error or any other estimator.

  6. Poor Barrabas...you've tried too hard and put too much weight on getting everything 100%...I see a perfectionist there:P

    I think instructors are human beings, and as such they make mistakes. We have to be a bit tolerant of that. For me, as long as I have the time, I'm willing to put time into it, and I learn SOMETHING, I'm happy with it.

  7. I agree, ML is better in nearly every way. Still, the topics covered by the AI class (like Bayes and A*) are so fundamental that I feel like I should finish, if only to know what tool to use for what job. But I've resigned myself to getting an awful score on the exams because I keep misunderstanding the questions.

    (There was a particularly awesome quiz question where he drew two nodes, A and B, and asked if they were independent. I answered Yes but the answer was No, and in the explanation video he said that they were dependent because there was an arrow between them, and went on to draw the arrow!)

  8. @Barrabas and @Matt R - I couldn't have put my feelings in better words myself. I'm still doing AI (although my heart is in the ML class). I went through the whole disenchantment process: from 100% on homeworks to 62% on HW4, 30-hour study/homework weeks, watching one video 6-7 times, questioning whether it's me or the course, mimicking solutions in homeworks without understanding the principles behind them. It's just one more experience in life, I guess. I'm sure we learned a lot of AI, but in a painful and not time-efficient way...
    Greg

  9. Do you suppose Stanford is exactly like this? I went to another Cal school a while ago; they're not all the same!

  10. I am continuing the class; the experiment and the smattering of topics are more than valuable to me. They are both accomplished in the field and are introducing what I feel are the key topics. True, the length of the class has hurt the practical coding assignment approach in favor of more abstract modeling, but I am just taking this for ideas. I know A.I. is a field in need of new innovation, so we do not want to just get fed the mechanics; I want to think like an A.I. researcher, not just someone rushing to code for some project. True, this clearly is not everyone's priority, and I respect that, but for me, I will tough it out gladly.

  11. @David -- Glad to hear you're finding the course useful. I would argue, though, that for many CS topics, AI included, it's hard to really thoroughly understand something unless you've either implemented something or played around with an existing implementation -- just the nature of the beast.

  12. @wcaicedo: Yes, delta is interpreted in many textbooks as an error term. Such an interpretation is actually quite widespread in the neural network literature. I understand that for pedagogical purposes, there is a need to explain things in more layman's terms. And in a loose sense, delta can indeed be thought of as an error term. However, strictly speaking, it is the change in error per infinitesimal change in the neuron's excitation level, so it is a sensitivity term, not an error term. Very few textbooks or lecture notes nowadays get this straight. The slides for backpropagation in the MIT OpenCourseWare are one of the few exceptions.

    You say "backprop unfolds just the same whether using log-error or squared error or any other estimator". This is incorrect. When the sum of squared error is used, then on the output layer, the delta for the j-th neuron should be calculated as delta_j = [a_j * (1-a_j)] * (a_j-y_j), not merely the training error (a_j-y_j). You may read any textbook for this fact. Or simply google "backpropagation algorithm". Both the first two pdf links (link 1, link 2) returned by Google show the correct formula, but some websites (like this one) do mistakenly take delta as the training error without realizing that the formula for delta depends on the activation function as well as the cost function.
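The two output-layer formulas under discussion can be checked numerically. The following is only a minimal sketch, assuming a single sigmoid output unit, a squared-error cost with the conventional factor of 1/2, and the usual log (cross-entropy) cost; the function names and the test values `z = 0.3`, `y = 1.0` are illustrative assumptions, not taken from either course.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def squared_error(a, y):
    # Assumed form: 1/2 * (a - y)^2
    return 0.5 * (a - y) ** 2

def cross_entropy(a, y):
    # Assumed form: the log cost for a single sigmoid output
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))

def numeric_delta(cost, z, y, eps=1e-6):
    """Central-difference estimate of d(cost)/dz for a sigmoid output unit."""
    return (cost(sigmoid(z + eps), y) - cost(sigmoid(z - eps), y)) / (2 * eps)

# Hypothetical pre-activation and target, chosen only for illustration
z, y = 0.3, 1.0
a = sigmoid(z)

delta_squared = a * (1 - a) * (a - y)   # squared error: sigmoid derivative times (a - y)
delta_cross_entropy = a - y             # log cost: the sigmoid derivative cancels

print("squared error:", delta_squared, numeric_delta(squared_error, z, y))
print("cross-entropy:", delta_cross_entropy, numeric_delta(cross_entropy, z, y))
```

Under these assumptions, delta equals the raw training error `a - y` only for the log cost; for squared error it carries the extra `a * (1 - a)` factor, which is the point about the formula depending on both the activation and the cost function.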

  13. I can't share your feelings in many respects. Sure, the ML and DB courses' infrastructure is much better. And the ML course is in itself practice-oriented. AI is an introductory class. The professors try to cover many very different topics in one of the largest fields of CS. I understand how hard it is for them to teach thousands of people at one time, a substantial number of whom don't have a sufficient math and CS background. I don't think this is a good decision; I would prefer an advanced course for better-prepared students. I hope there will be one in future.
    I have a bachelor's degree and I've had several AI classes, so it was surprising for me to learn something new in this online class. I got 100% on all the HWs, and only a few tasks were insufficiently clear or ambiguous. Maybe some people are trying to find traps in the HW questions and are therefore paying too much attention to unclosed brackets, misprints and other trifles of life.
    Take it easy! This is a good course. Not wonderful, not amazing, but just a good introductory course.

  14. I totally agree with you, and there are too many ambiguities in the lessons, and I really feel frustrated ...

  15. Can't agree with you. I'm having a great time, learning skills I'm already (not yet finished HW #6) using in my work. Props to both lecturers. Sure, it's not perfect, but it's still damn good.

  16. You are taking 3 classes at once? Do you already know the material or are you unemployed?

    I took a class at Columbia recently where the professor would skip many steps and it was difficult for me to follow. This is way better. I like the frequent quizzes and wish this unit included more. It is very helpful to do examples.

    Like many posters, I wish there were programming. I downloaded the first programming assignment from the actual class and did part of it. It is both rewarding and fun.

    As for you, Barrabas, is it so important to get every problem right? It was noted that the missing parenthesis was to be ignored. Just do the homeworks quickly and try to enjoy the class. Anything over two hours is too much for this unit's homeworks. They were easy (and I may get some wrong).

  17. @Andrew -- I'm a full-time software developer, but I know, or have once known, some of the material (DB from work, ML and AI from university, albeit 10 years ago). It's certainly nice to have more time now!

    I wish I had found out earlier about the programming assignments available from the real Stanford class; they do look like a lot of fun.
