Need a book? I am surprised by this website: http://www.whatshouldireadnext.com/
Where I read about it, it was called "last.fm for books." The idea
is that you start from a book that you like (or multiple books - I didn't
try the service: I'll get to that in a minute) and it then runs through
their current database of users' books and suggests a book for you.
Like last.fm, Pandora... it's collaborative
filtering put to specific use (or maybe they don't use an
algorithm, they use people - dunno). It's wisdom of crowds. Sounds
reasonable, but here's my big fat question: Who needs this? There
are "problems" that I could use help with, where wisdom of crowds
might be useful to me: what email tool to use (if, say, I wanted to
switch from GMail at home & Outlook 07 at work), or whether to
install the new Mac OS (Dave Winer would answer "no," what about
everyone else?)... I
don't need this service: I have stacks and stacks of books that I
want to read and haven't. I keep Amazon lists of them (86). My boss's
office also has several more books not on my lists that I
want to read. Further, if I were to zero out those lists, I still
wouldn't have a problem: it's far easier and faster to add books
than remove them. I've been on a bit of a spree lately: in the last
4 weeks, I've finished
Know How,
Freakonomics, and
Peopleware. This has been a good year generally, but those
were relatively easy and interesting reads and I had a plane trip
in there to help (Freakonomics was my YYC-->SFO flight). I'll
probably slow down over the next while, at least in terms of # of
volumes, since my current book is the 3.8-pound, 800-page
Designing Interactions. Great so far, but seriously, 800 pages?
That's 3-4 books - I guess there are about 30 authors, so that's
forgivable (and the quality is so high in the first 2 chapters
that it's more than forgivable), but still it's unusual - like a
6-hour movie would be. My
point in all this: I am a fast reader and enjoy reading, but I
can easily outpace my reading with additions to the
list. For example, the references in Peopleware alone added 2 more
books to the list: so I'm actually moving backward! I don't plan on
ever completing my reading list. Amazon lets you rank them and,
honestly, the ones ranked "low" or, poor souls, "lowest" will never
get read: it's just my list of books that could possibly be
interesting. But seriously, are there people who read a book and
then wonder what to read next? Is the problem really "I'd read more
if only I knew where to find more interesting content to read"?
I don't get it.

Prey

My current "fun" book (read: fiction helps me sleep) is
Prey, by Michael Crichton. Interesting, but I think it's a poor
choice: I don't think it's helping me sleep. It inspired some
thoughts about swarm behavior and I think I'll do a few fun
programming things to emulate emergent behavior, independent
randomized agent behavior, artificial learning, and learning
networks (in that order). Some of the stuff is a bit "out there,"
but some of it is actually true. I think his most insightful piece
(by research or luck) is the use of randomness in programming
learning, or at least programming analysis of data (analyzing &
understanding data = learning). But first: most
programming is very much not like what is described in
this book or in movies. Why not? Because programming isn't really
terribly interesting to watch or read about (not to non-coders,
anyways: it's too abstract). What programs do, on the other hand,
is very interesting. But for some reason, fiction about anything
technical feels the need to include the fictitious technology
behind it. I'm guessing it, ironically, helps the audience feel it
is real. However, techniques used in collaborative filtering (see
link above) sometimes do... Hmmm...

Collaborative Filtering

I've mentioned some of this before, but let's take the "what should
I read next?" question. You have a stack of users' books. And,
presumably, they rate them. They might get fancy, but let's just say
that they rate them from 1 to 5. So you have a bunch of books that
Joe rated 3, some he rated 4, some 5 (cuz, really, unless you
really disliked a book, are you going to take the time to enter it
into the system just to rate it "1"? - maybe some people). Anyways,
if you look at all of Joe's books, you can find patterns: lots of
books that are westerns get 5s and sometimes 4s. Good, you learned
something: Joe likes westerns. Then you wonder, what about those
western 4s - why didn't he like them as much? You notice that most
of the western 4s are not written by Louis L'Amour. Good, you
learned something else: he really likes Louis L'Amour. You can now
predict a few things about Joe's book preferences. Now, say John
taps in his first book: Valley of the Sun, by Louis L'Amour (you
can see where I'm going with this). Which book would you suggest to
him? Maybe one of the ones that Joe rated a 5. That's
collaborative filtering. Except computers do that many, many more
times (though not too many times - seriously, overdoing it is called
"overfitting") with each user to learn many different "rules" to
predict behavior; they look at many more users; and they don't
know what the rules really are. You can articulate it: it's "Louis
L'Amour," but the computer just knows that this group of books goes
together - it doesn't know it's the author (and it doesn't matter for
functional purposes). That brings up a very interesting point about
machine learning: it can't necessarily articulate what it has
learned or what it means in our language. A human would be able to
(for relatively simple rules, anyways). Let's say you wanted to map
these on a page: say you put a dot for each book, and then put dots
closer to or further away from each other depending on how much they
related to each other. You could end up with an interesting "map"
(the link is of movies, not books, but the concept is the
same). Why? Because it helps you visualize what is going on and, if
you are dealing with thousands of these, it's hard to understand from
a list. Think of, say, 18,000 movies clustering in groups, like this image:
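
One rough way to put a number on "how much they relate to each other" is to compare who rated two books and how. The sketch below is only an illustration, not what the website actually does: the ratings are made up, every name other than Joe, John and Valley of the Sun is hypothetical, and cosine similarity is just one common choice of measure that I'm assuming here.

```python
from math import sqrt

# Made-up ratings (1 to 5), purely for illustration. Joe, John and
# "Valley of the Sun" come from the post; every other name is hypothetical.
RATINGS = {
    "Joe": {"Valley of the Sun": 5, "Western A": 5, "Western B": 4, "Thriller C": 3},
    "Ann": {"Valley of the Sun": 5, "Western A": 4, "Western B": 4, "Thriller C": 2},
    "Bob": {"Western B": 3, "Thriller C": 5, "Thriller D": 5},
}


def relatedness(book_a, book_b, ratings=RATINGS):
    """Cosine similarity of two books' rating columns (unrated counts as 0)."""
    col_a = [user.get(book_a, 0) for user in ratings.values()]
    col_b = [user.get(book_b, 0) for user in ratings.values()]
    dot = sum(a * b for a, b in zip(col_a, col_b))
    norm = sqrt(sum(a * a for a in col_a)) * sqrt(sum(b * b for b in col_b))
    return dot / norm if norm else 0.0


def suggest(liked_book, ratings=RATINGS):
    """Return the other books, most related to `liked_book` first."""
    books = {book for user in ratings.values() for book in user}
    others = sorted(books - {liked_book})
    return sorted(others, key=lambda b: relatedness(liked_book, b, ratings), reverse=True)


if __name__ == "__main__":
    # John taps in his first book and gets a ranked list back.
    print(suggest("Valley of the Sun"))
```

With these made-up numbers the two westerns come out ahead of the thrillers for John. Notice that the code never looks at an author or a genre: it only knows which books tend to get rated together, which is the "can't articulate the rule" point from earlier.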
But where would you put the first dot/book? You'd
put it in the center maybe. The second? Nearby, if it was related,
but to which side? Now push those questions out thousands of
times... How do you figure out where to put dots based on their
relationships and, more importantly, if you want to have them
closer to or further from each other, how do you know where to put
them if there are thousands of related dots/books all "pulling" on
that dot with different weights? Back to Prey and randomness. It
turns out that randomness is a good way to go. You introduce some
guessing ("dot 2 goes left this time") and, once you've put dots
down, you take more guesses ("what if we picked this dot up and
moved it over here: is that better or worse?"). That may seem
inefficient, but remember, a computer can make hundreds of guesses
in the time it took me to write this sentence - and it would take a human a
long time to program a "better" way. It's sometimes called
measuring "energies" or "forces" (push/pull from other dots) and
"local maxima":
really nerdy academic pdf article here (check 2.4.1 for
minimizing local maxima with random "jumps"). Aside: Fun quote from
the article: "However, the energy 'surface' for thousands of
vertices [aka "dots"] is so chaotic (both spatially and temporally),
that, in practice, we have found the simpler method performs
better." Read: we tried other ways cuz you would think that would
work but, honestly, it was too hard and the random thing worked
better than what we tried. Sometimes you don't really understand
something until you explain it: your brain was churning away in the
background (while you slept) and had the answer ready: you just had
to ask for it in a different way (instead of "how does this work?",
"how do I explain this?"). But the mind working while you sleep is
a whole nother discussion. And not relevant for this evening,
clearly. I'm guessing the evil "eat people" behavior of the agent swarms in
Prey is probably why I'm still awake right now. I'll go try method
#2 for sleeping: football.
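
Postscript, for the curious: the "pick this dot up and move it over here: is that better or worse?" guessing game from the map discussion can be sketched in a few lines. This is my own toy version (plain random hill climbing), not the method from the academic article. The function names, the "wanted distance = 1 - relatedness" rule, the step sizes and the example scores are all assumptions made up for illustration.

```python
import math
import random


def energy(positions, relatedness):
    """Total strain: how far each pair of dots is from its wanted distance."""
    total = 0.0
    for (a, b), score in relatedness.items():
        (ax, ay), (bx, by) = positions[a], positions[b]
        wanted = 1.0 - score  # assumption: closely related dots want to sit close
        actual = math.hypot(ax - bx, ay - by)
        total += (actual - wanted) ** 2
    return total


def layout(relatedness, guesses=5000, jump_chance=0.05, seed=0):
    """Place dots by random guessing, keeping only moves that do not hurt."""
    rng = random.Random(seed)
    names = sorted({name for pair in relatedness for name in pair})
    # First guess: drop every dot somewhere random on the page.
    positions = {n: (rng.uniform(-1, 1), rng.uniform(-1, 1)) for n in names}
    current = energy(positions, relatedness)
    for _ in range(guesses):
        name = rng.choice(names)
        old = positions[name]
        # Mostly small nudges; once in a while try a much bigger jump.
        step = 1.0 if rng.random() < jump_chance else 0.1
        positions[name] = (old[0] + rng.uniform(-step, step),
                           old[1] + rng.uniform(-step, step))
        guess = energy(positions, relatedness)
        if guess <= current:
            current = guess          # better (or no worse): keep the move
        else:
            positions[name] = old    # worse: put the dot back
    return positions, current


if __name__ == "__main__":
    # Hypothetical relatedness scores between three books (0 = unrelated, 1 = identical).
    scores = {("Western A", "Western B"): 0.9,
              ("Western A", "Thriller C"): 0.1,
              ("Western B", "Thriller C"): 0.1}
    dots, strain = layout(scores)
    print(dots, strain)
```

Because a move is only kept when the strain does not go up, the layout can never get worse than the random starting guess. The occasional big jump is a crude stand-in for the random "jumps" idea: it gives a dot a chance to leave a spot that only looks good locally.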