Wednesday, December 02, 2015

Machine Learning Made Easier (or NOT!): "The Master Algorithm" by Pedro Domingos


Published September 22nd 2015.


How can one become an expert in ML? All one needs is a basic background in (multivariate) Calculus, Linear Algebra, and Probability. ML is math. If one wants to understand the techniques, one has to understand the math. No shortcut. If you want to start looking into the field of ML, this book is for you. If not, stay well clear.

My background is in computer science and software engineering, and I've been interested in ML for as long as I can remember. In 2013 I took Andrew Ng's ML class at Stanford University (for those of you who want to dive into stuff like this, here are my notes of the class; while learning the needed math can look daunting at first, it is actually quite fun once you get into it), and I was literally never the same. After that I did some Python coding to get a feel for the real thing, which I’m still doing to this day.

Humans ARE machines, albeit biologically-based. Billions of highly interconnected neurons receiving sensory input, lots of internal feedback, and signals that go out to motors, etc. Emotions, feelings, consciousness, are all just “concepts” we've constructed through a mixture of self-introspection and communicating with other self-introspecting machines (humans).

Some questions important to me in the field of ML:

"How does one simulate emotional pain input and response in a robotic system?"

Simulation is not the goal (Searle was sort of right, but I don't think any simulation would ever pass the Turing test anyway). But we will (eventually) build a real machine, with billions of interconnected, self-modifiable neurons; then we take a screwdriver, poke it into a sensor, and ask the machine if that hurt. Assuming we've wired up some vocal system, it will scream "YES!".

"What is the true purpose of emotions in human beings?"

Emotion is just our own thoughts, with some extra chemical feedback thrown in, resulting in what we call 'feelings' (anger, elation, euphoria, depression, etc.). The machine we build will claim to have all these, and who are we (as external observers) to argue with it? When it sticks a probe into an electrical socket, it will hurt, and after just a few trials it will probably avoid doing so because the pain outweighs the curiosity.

"How does the animal-like instinctive brain and the emotional brain aid or hinder the logical learning brain?"

This division (and hence any resulting interaction) is a contrivance. But I think the point being raised can be addressed simply: all aspects (neurological firing patterns) are interconnected through multiple levels of feedback loops. Everything affects everything. This is basically how our brains are wired up.

"How would creating a human-like self-thinking/self-evolving device such as a humanoid robot affect society and morality?"

How does creating a human (a self-evolving child) affect society and morality? As most responsible parents will say, one has to train and teach the child to choose good behaviours over bad. A robot-creator could use moral stories, time-outs, spanking (poking with that screwdriver), or whatever method is in vogue in their time/culture. There's no end to the methods humans have created to choose from.

If one just drops the idea that humans have any mystical “thing” inside them (mind, soul, consciousness, whatever) and realizes that humans are just machines and nothing more, (albeit extremely complicated and marvelous), then real AI is just a technical problem. We've just not built it yet.

Domingos’s book presents several ML algorithms/paradigms without delving into the math. I understand why the author did this, but I can't say I agree with the approach. It is important to realize that this book is not an easy one, despite being devoid of math. To really “understand” ML, one needs to understand the math behind the algorithms. In other words, it is not really a gentle introduction to the topic.

One of the topics that didn’t get much exposure in Domingos’s book was “can we use ML to vectorize?” Genetic programming should be the “tool” to use to create code that vectorizes. Genetic algorithms may be able to learn such algorithms... just as you may be able to measure the width of your bookcase with a super-high-resolution, super-powered GPS-type system. That doesn't make it the best tool: use a measuring tape! The amount of time it would take a genetic algorithm to become as effective as some of the recipes we have learned here is very long, I suspect. So finding someone who knows what they're doing will still be a more effective strategy than getting a genetic algorithm system and just waiting and hoping it will work. And, even then, ML expertise will be valuable in ensuring the genetic algorithm is not overfitting to the training data, and in tweaking it so that it reaches a better solution faster without falling into local optima. Finally, there is well-understood mathematical theory behind the algorithms we learn. There are proofs about how long each algorithm will take, how effective each is, how well it will scale to large data, and other aspects. The results of a genetic algorithm will only have one property: the genetic algorithm that produced them says "Trust me! It works (as far as I know)!”
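To make the measuring-tape point a bit more concrete, here is a minimal sketch of a genetic algorithm in Python. Everything in it is an illustrative assumption of mine and not something taken from the book: the toy objective `fitness`, the `evolve` function, the population size, the mutation scale, and the fixed random seed. The optimum of the toy objective is known analytically (x = 3), which is exactly the kind of problem where the "measuring tape" already exists.

```python
import random


def fitness(x):
    """Toy objective (hypothetical): the best possible value is 0, at x = 3."""
    return -(x - 3.0) ** 2


def evolve(pop_size=30, generations=60, mutation_scale=0.5, seed=0):
    """Very small genetic algorithm over real numbers in [-10, 10]."""
    rng = random.Random(seed)
    population = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))

        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0]]             # elitism: keep the best as-is
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            child = (mum + dad) / 2.0                # crossover: average
            child += rng.gauss(0.0, mutation_scale)  # mutation: Gaussian noise
            children.append(min(10.0, max(-10.0, child)))
        population = children

    best = max(population, key=fitness)
    return best, history


best, history = evolve()
print("best candidate found by the GA:", best)
print("analytic optimum:", 3.0)
```

With these settings the run evaluates the objective on the order of thousands of times to approximate an answer that one line of algebra gives exactly, and the only evidence it offers is the fitness value it happened to reach.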

I was expecting to see more stuff on search engines, but that didn’t happen. I think Domingos thought it was too specialized a topic to approach in a book of this kind. The really interesting thing about search engines is that people are constantly trying to manipulate them to get their sites to rank. For this reason the ML algorithms have to be incredibly complex, so that they can not only show the best results logically but also work out which sites seem too good but are actually very clever spam. I'm pretty sure that there must be a fair bit of ML going on in that. Just the amazing accuracy of the spelling correction in Google alone is pretty impressive. I think it's at least 20 times better than at working out my clumsy keyboard mashings. Hyperlinking is also a major factor in how they rank pages, and this is one of the most manipulated and spammed areas of the internet. Check out PageRank on Wikipedia. They must use tons of ML to detect patterns of links and work out whether they are natural or not. The thing is, the algorithm is so good that it's not really worth manipulating, but it's very interesting to learn about.
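For readers who want a feel for the link-based ranking idea, here is a minimal sketch of the textbook power-iteration form of PageRank in Python. It is a simplified illustration under my own assumptions, not what any real search engine runs today: the function name `pagerank`, the damping factor of 0.85, and the four-page `toy_web` graph are all made up for the example.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(max_iter):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page shares its current rank equally among its outgoing links.
        for page in pages:
            targets = links.get(page, [])
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)

        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical four-page web: "d" links out but nobody links to it.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 4))
```

In this toy graph the page with the most incoming links ends up on top, and the page nobody links to ends up at the bottom, which is why link spam tries to manufacture incoming links.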

Jason Brownlee released a preview of his new book titled “Clever Algorithms: Statistical Machine Learning Recipes”. The beauty of this book is that it is all about code. Each chapter tackles a particular machine learning algorithm. For each algorithm he gives a strategy, heuristics, and a usage code implementation in the R programming language. According to the author: “Implementing ML algorithms is difficult. Algorithm descriptions may be incomplete, inconsistent, and distributed across a number of papers, chapters and even websites. This can result in varied interpretations of algorithms, undue attrition of algorithms, and ultimately bad science. This book is an effort to address these issues by providing a handbook of algorithmic recipes drawn from the field of Machine Learning, described in a complete, consistent, and centralized manner. These standardized descriptions were carefully designed to be accessible, usable, and understandable. An encyclopedic algorithm reference, this book is intended for research scientists, engineers, students, and interested amateurs. Each algorithm description provides a working code example in R.” He has written another book titled “Clever Algorithms: Nature-Inspired Programming Recipes”, in which the code is in Ruby. This Ruby code is an actual implementation of each algorithm. Both of his books are free for online viewing and are a goldmine for anyone looking for actual code.

Does simplifying the material to suit a mass audience actually help them to learn/get acquainted, or does it give them just enough knowledge to be dangerous? Simply put... it's ML, it's complex, and it's inherently mathematical. As an introduction its level was medium-high, and Prof. Domingos has done an admirable job explaining the concepts, but it still felt too breezy and tended to emphasize at times (perhaps erroneously) that knowing how to turn the knobs and when to turn them is sufficient and in some way the 'stuff' of ML anyhow, so let’s not bother with the mathy bits. I can't say I entirely agree, but I am grateful for the experience and hope that the focus going forward for these kinds of books is more rigour, i.e., a better balance between applying the ideas and understanding what drives them.

But what are we really seeing here? I mean, what kind of people are interested in ML?  Developers, scientists, engineers, etc. These are the kind of people who are generally always learning. As a group, we most likely chose our careers because we enjoy learning so much. Contrast our group with just walking down the street and asking randomly selected people questions like: how many non-fiction books have you read lately? What new things have you learned in the last week? And how old are you? Would the percentages be anything like our group? Would we see any correlation between age and learning? (Now I'm curious)

But for us here, I think we are all pretty much addicted to learning, and will be for the rest of our lives, and as such, present a lot of bias that makes it harder to query for correlations like age and learning...

NB1: Disclaimer. Pedro Domingos is a fellow-countryman of mine, working at The University of Washington in the field of Computer Science & Engineering.

NB2: Unfortunately, some math skills are an absolute requirement to fully appreciate this book, in my opinion. The good news is that you don't need calculus to apply ML techniques like linear regression, neural networks, SVMs, etc. Algebra, basic linear algebra, and descriptive statistics will take you a long way toward your goal.

In my opinion, here are 4 core skills necessary to learn basic machine learning techniques well enough to apply in your job:

1) Linear algebra.  There are some online courses and inexpensive books that can help.  You must understand basic matrix operations like inner product, transpose, and inverse.  There was a Coursera course last July called Coding the Matrix that would help you understand how linear algebra is applied to problems. I think it's still available offline.

2) Basic probability and statistics.  There are a couple of online classes at Udacity.  You need to understand descriptive statistics (mean, standard deviation), and probability.  These skills are necessary for understanding linear regression, and almost everything yet to come in a ML class.

3) Programming skills.

4) Algorithms.  I think one of the hardest parts of this field is translating the algorithms that are expressed in mathematical form (like Professor Ng's cost functions, etc.) to code.  If you don't have a math background, then it's easy to be intimidated by this.  I've found that breaking the math down as much as possible helps.  It takes time, practice, and persistence.  I don't think there's any class that teaches this as a specific skill - it's just something you have to work on.
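As a small illustration of that last skill, here is one way to translate a formula into code, sketched in plain Python. I am assuming the usual squared-error cost for single-feature linear regression, J(theta) = 1/(2m) * sum((h(x_i) - y_i)^2) with h(x) = theta0 + theta1 * x, minimized by batch gradient descent. The function names, the learning rate, the number of steps, and the five data points are all hypothetical choices made for the example.

```python
def predict(theta, x):
    """Hypothesis h(x) = theta0 + theta1 * x for a single feature."""
    return theta[0] + theta[1] * x


def cost(theta, xs, ys):
    """J(theta) = 1/(2m) * sum((h(x_i) - y_i)^2), written term by term."""
    m = len(xs)
    return sum((predict(theta, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_step(theta, xs, ys, alpha):
    """One batch gradient descent update with learning rate alpha."""
    m = len(xs)
    errors = [predict(theta, x) - y for x, y in zip(xs, ys)]
    grad0 = sum(errors) / m                             # dJ/dtheta0
    grad1 = sum(e * x for e, x in zip(errors, xs)) / m  # dJ/dtheta1
    return (theta[0] - alpha * grad0, theta[1] - alpha * grad1)


def fit(xs, ys, alpha=0.05, steps=5000):
    """Run gradient descent from theta = (0, 0) for a fixed number of steps."""
    theta = (0.0, 0.0)
    for _ in range(steps):
        theta = gradient_step(theta, xs, ys, alpha)
    return theta


# Made-up data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

theta = fit(xs, ys)
print("theta:", theta)
print("cost at theta:", cost(theta, xs, ys))
```

Each line of `cost` and `gradient_step` maps onto one piece of the formula (the sum, the error term, the 1/m factor), which is the "breaking the math down" habit described in point 4.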

NB3: ML = Machine Learning; AI = Artificial Intelligence.

4 comments:

Luis Franco said...

Very interesting! I'm going to check out the books on "clever algorithms"! I've always loved algorithms!

Book Stooge said...

Do you really think that humans are just biological constructs without a soul?

Manuel Antão said...

Book Stooge, the operative word here is "if". I quote: "If one just drops the idea that humans have any mystical “thing” inside them (mind, soul, consciousness, whatever) and realizes that humans are just machines and nothing more, (albeit extremely complicated and marvelous), then real AI is just a technical problem. We've just not built it yet."

Thanks for reading till the end.

Book Stooge said...

Thanks for replying and sorry my reply is so late. I don't let blogger email me when responses are made.

Glad you pointed out that "if". It was lost on me in the paragraphs. And answers my question all at the same time. Cheers!