really, nothing here

software geek

30.6.07

iPhone Pity Party

I've owned my iBrick for 24 hours now (though at least I remain too cool to have stood in a line for it). However, I still haven't been able to get online. I seem to have committed the cardinal sin of moving. Somewhere buried in the number portability laws is a nuance about carriers only being obligated to move numbers for mobiles in the same metro area. As far as I can tell, the other telcos are now remembering this portion of the law and not allowing phone transfers.

Fortunately there's a way around this, and appleinsider has taken the time to document it. But that only works if your original activation attempt doesn't get stuck in limbo. Which mine has. Joy.

At least I'm not alone: there's a growing thread on gizmodo.

28.6.07

Thom Mayne

Thom Mayne's San Francisco Federal Building had me awestruck when I saw it yesterday for the first time. The building makes use of various materials that come out at incredibly sharp angles, and there's an amazing use of negative space in the upper levels of the main tower, along with exterior spaces that are traditionally interior (like the stairwells). It's a wonderful thing to see and I recommend it to anyone visiting the San Francisco area. Incredibly, the building qualifies for LEED certification and uses less than half the energy of a traditional office building through aggressive use of shade. All this while being completely compliant with post-9/11 DHS security measures.

Miya Ando Stanoff

Miya Ando Stanoff made the rounds on the internet a few years ago (I was sure I first learned about her on CoolHunting, but I can't find it up there anymore). She works with flat cold steel that's treated with tempering heat (I think) and pigments. The result is a really cold effect that's very soothing to me. I had the privilege of seeing her work in person in San Jose a little over a year ago, and the effect really is quite stunning. Her site has a new movie on it that makes me all mellow.

26.6.07

Miranda July

Miranda July takes the role of Angelina Jolie in my circle of friends as the woman most likely to make other women leave their men.

So why is July so lonesome? No One Belongs Here More Than You really never hits any notes other than longing, and so it's been a bit of a slog to go through the never-ending series of pretty, pithy and pathetic stories of unrequited love. Each one of these stories ranks up there in the better literature of the year, but in general I have to say that I feel July needs to break out of her mold and try something new. Like inventing even cooler viral marketing campaigns.

19.6.07

New project

Like there's not enough going on in my life with the internship, the Netflix challenge, pico and (re-)learning analysis, now I'm gonna try to do some work on deriving social networks from a two-dimensional data set. The idea is to look for actors working in coordination and build out nodes in a social graph based on their coordination. Then rebuild a two-dimensional data set based on that graph and do it again until you can't orthogonalize the data set anymore. What results should be an approximation of the information network used to move data between all the actors. Maybe.
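To make the first step a little more concrete, here's a rough sketch of one way to turn a two-dimensional data set into graph edges. Everything in it is an assumption for illustration: the rows are actors, the columns are time slots, a 1 means the actor did something in that slot, "coordination" is measured with Jaccard overlap, and the names `jaccard`, `coordination_graph` and the 0.5 threshold are made up. It covers only the graph-building pass, not the rebuild-and-repeat part.

```python
from itertools import combinations


def jaccard(a, b):
    """Slots where both actors are active, divided by slots where either is."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def coordination_graph(activity, threshold=0.5):
    """Return the set of actor pairs whose overlap meets the threshold.

    `activity` maps an actor name to a list of 0/1 flags, one per time slot.
    The overlap measure and the threshold are illustrative assumptions.
    """
    edges = set()
    for u, v in combinations(sorted(activity), 2):
        if jaccard(activity[u], activity[v]) >= threshold:
            edges.add((u, v))
    return edges


# Hypothetical toy data: three actors observed over four time slots.
activity = {
    "a": [1, 1, 0, 1],
    "b": [1, 1, 0, 0],
    "c": [0, 0, 1, 0],
}
print(coordination_graph(activity))
```

Any other similarity measure (correlation, mutual information) could be swapped in for the overlap score; picking the right one is most of the open question.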

So the question, for all two of you reading this blog, is: what's out there in the literature about this sort of thing? Other than my alma mater, who's trying to build social networks from indirect measures rather than just interviews?

The topic will be picked up again in the near future.

17.6.07

More Funk

A few more notes on the Funk algorithm for those people trying to use it.

1) Do keep in mind that the algorithm isn't a solution to the problem per se -- since the parameters are all synthetic, you've got a lot of work to do in selecting their values, and there are a lot of interesting techniques for figuring them out. My personal favorite for this system is k-fold cross-validation. But you can try other approaches as well.

2) Funk mentions that the penalty term in the fitting algorithm is related to Tikhonov regularization. It really isn't, because those techniques don't really check for the balance of the parameters like Funk's term does. That's not a bad thing; it's likely what makes this system work so damn well. But you've probably got a good chance of improving things further by introducing an AIC- or BIC-based regularizing term to your RMSE fitting calculation.
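For anyone who wants to see the two notes together, here's a minimal sketch of picking the penalty weight by k-fold cross-validation. It is not Funk's code: it trains all the features at once rather than one at a time, and the function names (`train`, `predict`, `k_fold_rmse`, `pick_reg`), the learning rate, the epoch count and the 1 to 5 clipping range are my assumptions. Ratings are plain `(user, movie, rating)` tuples.

```python
import math
import random


def train(ratings, n_factors=2, lrate=0.01, reg=0.02, epochs=100, seed=0):
    """Fit user and movie feature vectors by stochastic gradient descent."""
    rng = random.Random(seed)
    mean = sum(r for _, _, r in ratings) / len(ratings)
    p, q = {}, {}
    for u, m, _ in ratings:
        if u not in p:
            p[u] = [rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
        if m not in q:
            q[m] = [rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
    for _ in range(epochs):
        for u, m, r in ratings:
            err = r - (mean + sum(a * b for a, b in zip(p[u], q[m])))
            for k in range(n_factors):
                pu, qm = p[u][k], q[m][k]
                # The "- reg * value" part is the penalty term from note 2.
                p[u][k] += lrate * (err * qm - reg * pu)
                q[m][k] += lrate * (err * pu - reg * qm)
    return mean, p, q


def predict(model, u, m):
    """Predict a rating, falling back to the global mean for unseen ids."""
    mean, p, q = model
    if u not in p or m not in q:
        return mean
    raw = mean + sum(a * b for a, b in zip(p[u], q[m]))
    return min(5.0, max(1.0, raw))


def rmse(model, ratings):
    """Root mean squared error of the model on a list of ratings."""
    total = sum((r - predict(model, u, m)) ** 2 for u, m, r in ratings)
    return math.sqrt(total / len(ratings))


def k_fold_rmse(ratings, k=3, fold_seed=0, **params):
    """Average held-out RMSE over k folds for one parameter setting."""
    shuffled = list(ratings)
    random.Random(fold_seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        training = [x for j, f in enumerate(folds) if j != i for x in f]
        scores.append(rmse(train(training, **params), held_out))
    return sum(scores) / k


def pick_reg(ratings, candidates, k=3):
    """Return the candidate penalty weight with the lowest k-fold RMSE."""
    return min(candidates, key=lambda reg: k_fold_rmse(ratings, k, reg=reg))
```

Calling `pick_reg(ratings, [0.0, 0.02, 0.1])` returns whichever of the three weights scores best on the held-out folds. The same loop works for the learning rate or the number of features; it just gets slow fast on the real data set.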

SVD is dead, long live SVD

Simon Funk (a pen name) pretty much single-handedly raised the bar on what can be considered "barely competent" collaborative filtering when he published a speedy, stable and more or less correct SVD solution to the Netflix prize. It's a huge contribution to the community and, although it hasn't begotten a succession of disclosures about incremental improvements to the algorithm (presumably made by mucking around in Simon's postulated non-linear "G" function), it's clear from the incredible volume of scores centered on Simon's original RMSE of .90 just how important this contribution is.

I wish I had something as fundamentally interesting to contribute to the discussion, but at this time I'm only just getting back to working on the prize after a pretty rough semester at fake grad school. What I can say is that as nice a solution as the SVD with non-observable regression variables is, it's incomplete, and probably not extendable into the winning solution space. This isn't conclusive, of course, and there's a lot of work that could be done to prove me right or wrong: backing out probability intervals on the discovered regression variables, and working out what introducing observable, as well as unobservable, regression variables does (I'm guessing that's how Simon moved up to .89 -- or maybe he mixed his results with yet another team).

The real problem, however, is in the assumption that components add up linearly to compose an ultimate value. There are a couple of takes on this, but the combination of uniform density, clipping at max and min values, and linear combinations seems destined only to adequately represent the average user's reaction to mediocre movies. There's a ton of work on how responsive individuals really are to "quality" and none of it suggests that the reaction is in any way linear. Furthermore, that isn't really the interesting bit of the question -- "double averaging", or averaging the average score of a user with the average score of the movie, yields results that really aren't that much more wrong than the SVD results (.97ish vs. .90ish).
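For reference, double averaging is about as simple as a predictor gets. Here's a minimal sketch; the function name `double_average`, the `(user, movie, rating)` tuple layout and the fallback to the global mean for unseen users or movies are my assumptions.

```python
def double_average(ratings):
    """Build a predictor that averages the user's mean and the movie's mean."""
    user_totals, movie_totals = {}, {}
    for u, m, r in ratings:
        s, n = user_totals.get(u, (0.0, 0))
        user_totals[u] = (s + r, n + 1)
        s, n = movie_totals.get(m, (0.0, 0))
        movie_totals[m] = (s + r, n + 1)
    global_mean = sum(r for _, _, r in ratings) / len(ratings)

    def mean_of(totals, key):
        # Unseen users or movies fall back to the global mean (an assumption).
        if key not in totals:
            return global_mean
        s, n = totals[key]
        return s / n

    def predict(u, m):
        return (mean_of(user_totals, u) + mean_of(movie_totals, m)) / 2

    return predict


# Hypothetical toy ratings.
predict = double_average([("a", "m1", 5), ("a", "m2", 3), ("b", "m1", 1)])
print(predict("a", "m1"))
```

There's nothing to tune and nothing to clip, since an average of in-range scores stays in range, which is part of why it makes a handy baseline to beat.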

So let's pretend for a second that you figured out which of the seven or so decent distribution choices to use to represent user responsiveness to quality. Now you've got to figure out how you're going to handle the issue of self-selection. This arises because, well, people can only rate what they've seen (well, honest people) and people only see things they have a reasonable shot at liking (well, normal people). Salon's Machinist, Farhad Manjoo, writes this up better than I can. So now you have two interesting distribution issues that I can't wait to get cracking on: what's the distribution of a user's knowledge of the general responsiveness they will have to the movie's quality, and what's the distribution of the user's probability of even trying this movie out? But wait, don't start plugging in your hierarchical models just yet, because you only care about these distributions during the training (regression) of your model. During the prediction phase users don't get a chance to self-select, so you should just spit out the results.
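A toy simulation shows why self-selection matters for training. All the numbers here are made up: affinities are drawn from a Gaussian, the chance of watching a movie rises with affinity along a logistic curve, and the function name `simulate` and its parameters are hypothetical. The only point is that the ratings you get to see average higher than the ratings everyone would have given.

```python
import math
import random


def simulate(n_pairs=20000, seed=0):
    """Return (mean rating over all pairs, mean rating over watched pairs)."""
    rng = random.Random(seed)
    all_ratings, seen_ratings = [], []
    for _ in range(n_pairs):
        # How much this user would like this movie (assumed Gaussian).
        affinity = rng.gauss(3.0, 1.0)
        rating = min(5.0, max(1.0, affinity + rng.gauss(0.0, 0.5)))
        # People mostly watch what they expect to like (assumed logistic).
        p_watch = 1.0 / (1.0 + math.exp(-2.0 * (affinity - 3.0)))
        all_ratings.append(rating)
        if rng.random() < p_watch:
            seen_ratings.append(rating)
    return (sum(all_ratings) / len(all_ratings),
            sum(seen_ratings) / len(seen_ratings))


full_mean, seen_mean = simulate()
print(round(full_mean, 2), round(seen_mean, 2))
```

A model trained only on the watched pairs learns the second number, not the first, which is the gap the two distributions above are meant to account for.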

(Well, maybe. The Netflix corpus kinda sucks with regards to little things like just who made what recommendations under what conditions and, frankly, though what I said holds true for a real production system, it's entirely likely that all of the inputs in the Netflix corpus, including the holdout set, are self-selected reviews subject to the previously mentioned fun and games with hierarchical modelling.)

Like other posts -- this one keeps growing, come back for results and more thoughts.

14.6.07

Off to SFO


I'm moving to San Francisco today for the summer. I'll be working as a summer associate for O'Reilly AlphaTech Ventures and I couldn't be more excited. For those of you not in on the O'Reilly brand - Tim O'Reilly gets the credit for bringing the Web 2.0 term to the mainstream with his canonical Web 2.0 conference a few years back. He has also been a huge voice in open source software and has for decades produced the majority of the manuals you might want for said software. My managing directors seem really laid back and nice and I can't wait to meet them in person.

For those 2 of you reading this blog who still live in the Bay Area, I'm looking to hang out with you while I'm out west. Don't be a stranger.