Book Thoughts 2011

I began 2011 with a bold new year’s resolution: Consume more books than movies. I failed miserably. Movies are just too easy to watch. But I still found time to read some great (and some not so great) books:

The World is Flat by Thomas Friedman

I put off reading this for a long time. Friedman sheds some light on globalization for us laymen, but the value comes from the specific examples he gives about real companies. It’s full of “that’s interesting” moments.

Breakfast of Champions by Kurt Vonnegut

“The story isn’t as important as the writing.”

–Jodi Z, right now

My girlfriend made this remark when I told her I couldn’t remember why I enjoyed Breakfast so much. The joy of Vonnegut comes from the short passages that happen to be particularly poignant.

The Big Switch by Nicholas Carr

I recommend this book to any layperson who mentions “the cloud”. Carr walks the reader through the history of electricity as an analog to computing. It’s a compelling read, but it wasn’t enough to get me to take a job at a certain San Francisco PaaS company.

The Book of Sand and Shakespeare’s memory by Jorge Luis Borges

Lots of fun little inception tales in here. Borges must have been a writer for Arrested Development because I could read this thing again and it would still be just as good. Fun fact: In the edition of the book I read, Borges gives a shoutout to UT’s Perry-Castañeda Library where I checked out the book itself.

Hitchhiker’s Guide to the Galaxy by Douglas Adams

You’ve already read this so I won’t discuss it. I will tell you that one of my friends used to say “That’s so 42, dude.” in high school. Bullies were invented to sniff out phrases like that.

A Scanner Darkly by Philip K. Dick

I wish I had never seen the film, because I kept picturing Keanu Reaves’ rotoscoped face. The narrative was great, but the druggie lingo started to wear on me by the end.

Little Brother by Cory Doctorow

This was a ton of fun. I immediately gave it to my little brother as a sneaky way to teach him about public key cryptography, RFID, and sticking it to the man.

Machine of Death by various authors

I’m not sure if “concept book” is a real term. If not, consider it coined. The concept of this anthology is that there is a machine that gives you a short vague description of how you will die. About 20 different authors give their takes on the idea in the form of short stories. It’s a good read because many of the authors are pretty clever.

The Restaurant at the End of the Universe by Douglas Adams

The first original was funny, so I really had no choice about reading this one. The obnoxiously cheerful computer and the paranoid android are the number one reason to read these books.

Making the World Work Better by various authors

This book was given to every single IBM employee. It’s three accounts of IBM’s rise, near-fall, and resurrection. I love Watson’s original motto: “Think”. It’s so basic, such a far cry from the bullshit corporate mission statements of the last few decades. But the best part of the entire book is the paragraph where the author hand-wavingly explains that IBM is totally not liable for that Holocausty stuff and basically solved Apartheid. No big deal.

The Wave by Todd Strasser

The target demographic for this book is likely preteens, but I picked it up because the wikipedia article was intriguing. Poor decision. It’s an embellishment of a true story in which a high school teacher demonstrates how easy it is for a population to get swept up in Nazism.

White Noise by Don DeLillo

Probably the most “important” book I read all year. I’m reminded of it whenever I hear the reverberations of advertising ringing in my ears. “This program brought to you with limited commercial interruption by AT&T.”

Robopocalypse by Daniel H. Wilson

This book is exactly what it sounds like. The narrative is told through journal entries detailing many different subplots that eventually converge. I used to find frequent context switching kind of annoying, but now it’s somehow more appealing to my web-addicted brain. The survival section could have been a lot longer. I’m not really convinced that the humans could overcome their new robot overlords in 50 pages. It could have been drawn out and split into several volumes. Can’t wait for the movie.

Ready Player One by Ernest Cline

This was so easy to finish because it’s a thriller-for-nerds. Cline knows how to relate to his target demographic. At one point the protagonist is inserted into War Games and has to “play” as Matthew Broderick’s character. It’s good to know that there’s a substantial number of people who also aspired to be like him.

I’ll be reading a lot more in 2012, as per my new year’s resolution. A kindle showed up in the mailbox today so I picked up some books– 1000 of them.

An Interesting Musical Parallel

A recent episode of Radiolab discusses William Basinski’s disintegration loops.

[…]William Basinski, a composer who loops analog tape to create a unique sort of music. One day, Basinski dug up some of his old tapes, stuck them into his player, and heard a melody in the throes of death.

They’re beautiful. They sound like this:

After listening to a few of these I realized there’s a whole other musical domain full of short loops and slowed down melodies: The chopped and screwed style of Houston Rap. Here’s a quintessential example:

Basinski has worked in New York for the past 25 years, but where was he born?

Houston.

Real Time Indoor Tracking for Android Phones

For my senior design project I built a system with a few friends to track Android phones indoors using Wi-Fi signal strength. The technique we used is called Wi-Fi fingerprinting and has been detailed in a few academic papers [1][2]. A funded startup called Wi-Fi Slam is attempting to commercialize the same concept.

Update 3/23/2013: WifiSLAM has been bought by Apple for $20 million. There’s a lesson here somewhere.

The Product

The Android app looks like this: The web app allows you to track many phones. It looks like this:

Indoor Tracking Web Application

Source on GitHub:

Remote Monitoring Application

Mobile Tracking Application

Mobile RSSI Recording Application “Trainer app”

Disclaimer: Much of the source is a mess. I wrote the RMA and the mobile UI plotting code.

Background

The project is driven by the fact that GPS performs poorly indoors– the receiver chips just can’t receive reliable signals through roofs. We decided to get around this with Wi-Fi. The original specs called for a personal inertial device built on Texas Instruments hardware. However, we weren’t too keen on designing an original embedded system and the math required to coordinate a gyroscope, accelerometers and e-compass was unpalatable. Our solution is to construct a table of Wi-Fi signal strengths. At many points inside the building, we recorded X-Y coordinates and the RSSI values from each available access point. We could then perform a lookup in this table to find the nearest point and tell the user that he was at that location.

Overview

The design consists of two subsystems. The first is an application on an Android phone, and the second is the Remote Monitoring Application that runs on a web server. The smartphone application sends real-time position information to the RMA. The system overview below enumerates the inputs and outputs of each subsystem and the interaction between them. The mobile application reads Wi-Fi RSSI values and turns them into positional data. The RMA receives these coordinates and plots them on a map in the user’s browser.

Real Time Indoor Navigation System Diagram
System Overview

Android Application

The mobile application is an Android binary that users run on their smartphones. It tracks the mobile phone’s changing position using Wi-Fi fingerprinting. Fingerprinting allows the application to compare current RSSI values from nearby Wi-Fi access points to previously gathered values in a table stored on the mobile device. The stored values correspond to a physical location, thereby providing an accurate position reading. The application constantly displays its computed position on a map and relays that position to the RMA over an available Internet link (Wi-Fi). The position lookup itself is O(n * m) where n is the number of rows in the table and m is the number of access points sensed. Two nested loops compute the dot product[3] between the current RSSI values and those in the table. The point with the minimum difference is considered to be the closest point and sent to the RMA. The interesting code is in Map.java in the PointWithRSSI method.

Mobile Indoor Navigation Application Diagram
Android Application

Remote Monitoring Application

The RMA is a node.js web application built on the Express framework. Coordinates are received by the server as JSON-encoded post requests and pushed to the user’s web browser in real time using the socket.io library. The client-side JavaScript component plots these coordinates on an OpenLayers map. It supports an arbitrary number of phones as long as they report a unique device_id. The mobile app code uses the Android ID for this. The server-side code also stores the data received in MongoDB for future use. The original project requirements called for heat maps, but they were not completed. My implementation is hard-coded to the ENS building, but if you want to adapt our code, you’ll find that OpenLayers is pretty easy to work with. It wouldn’t be a stretch to swap out the image. Pay close attention to the bounds in map.js. They define the coordinate system used for plotting. The RMA also includes functionality to help train the mobile app. Clicking at any point on the map displays the XY coordinates of that spot. We entered these values into the trainer app when recording RSSIs.

Remote Monitoring Application
Remote Monitoring Application

Aftermath

Long story short, we had everything working for the open house. The accuracy was spotty– about 6 meters. It seemed to perform better in the hallway where the access points were installed rather than the rooms. Accuracy may have been improved by recording more points. In the end, it was enough to convince the professors. We ended up winning second place and taking home a modest cash prize. I wore that obnoxious shirt to troll Merwan for requesting business casual attire. It went okay.

Shoutout to Tim’s wife for doing the poster

This project was more of a lesson in project management than anything technical. We had another team member not pictured above because he just didn’t show up to anything. Not even the open house. Early on we made sure his work would be limited to the wholly independent heat map component. Sure enough, it never got completed.

References

[1] http://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?article=1007&context=cpesp

[2] http://dl.acm.org/citation.cfm?id=1451884 [3] The operation done for each BSSID column in the table is: difference += (stored RSSI) ^2 – (stored RSSI * current RSSI).  After the outer loop iterates through each row, the minimum difference tells you which row has the XY coordinates you need. This makes sense, but I’m not convinced it’s actually a dot product. Parts of this post were taken from a Testing and Evaluation plan coauthored by Rohan Singhal, Tim Osborne, and Merwan Hade, and myself.

Update: 11/29/2012: Today Google announced a very similar product: http://googleblog.blogspot.com/2011/11/new-frontier-for-google-maps-mapping.html

Update: 3/8/2018: The forthcoming Android P will support indoor positioning by providing explicit round-trip-times to access points. https://android-developers.googleblog.com/2018/03/previewing-android-p.html?m=1

The MacBook Air: It was supposed to be so easy

I finally bit the bullet and picked up a MacBook Air. Ubuntu 11.04 had a problem with remote folders breaking after waking up from a suspend. That was the last straw. I’m fine with paying more so I can spend time hacking on web stuff instead of troubleshooting desktop issues.

For all the hype, I expected a machine descended from the heavens; a laptop from the desk of Zeus. These religious aspirations for a notebook were a bit misplaced because there are a number of problems with the Air’s experience. This is not a critique of Lion, the hardware, or any one organization in particular, but a quick dialectic on the overall user experience, including third party applications.

Problems:

  • No out-of-the-box support for remote folders via SSH. Ubuntu makes this a breeze. I spent way too long following blogs to find out how to set this up. I eventually bought an app called transmit that does the job.
  • I’m expected to pay money for a decent text editor? Notepad++ and Gedit are both fantastic and free. TextMate better be the best thing since canned bread.
  • Homebrew depends on Xcode which is over 3 gigabytes. There isn’t a ton of space on this SSD.
  • Have to buy $15 Thunderbolt to VGA adapter to use with my LCD screen.
  • Sometimes the command key ⌘ functions like a Windows control key, as in copy-paste, but other times it doesn’t, like when switching through tabs in Chrome.
  • Mail doesn’t work. Gives IMAP error when trying to log in to Gmail. Probably due to my Gmail settings, but no indication of this.
  • When compiling stuff the fan really makes a fuss. The hot area is small, but too hot to touch.
  • Brew fails to compile dependencies for SSHFS. Not going to troubleshoot this.
  • When computer speakers are plugged in, I can hear very faint interference from a TV or radio station. Seriously.
  • “Delete” key appears to do nothing in Finder. I assume this is because it’s really the backspace key.
  • When using a USB mouse, the scrolling direction is reversed. I can change this in the settings, but then all the touchpad gestures are reversed.

Bright spots:

  • Boot time is unbelievable. I tried to time it but it completed before I started the timer.
  • Terminal included, Chrome and FF easy to set up.
  • Keepass + Dropbox work exactly the same way as they do on Windows, Linux, and my Android phone. Way to be consistent.
  • Heat is contained to the area above the function keys.
  • Dead silent when fan isn’t spinning.
  • Supportive community with lots of recommendations
  • Keyboard shortcuts and gestures will save a lot of time.

A number of positive things go without saying. I didn’t have to hunt down Wi-Fi or video drivers and I didn’t have to re-install the OS just to get rid of bloatware. But, for $1,343.79 I expected a little more divine intervention.

On the whole, the problems listed are pretty minor. I was able to get a comfortable working environment set up in a few hours. I’ll be spending inordinate amounts of time with this machine for the next few years, so send some OSX productivity tips my way.

Edit: Disregard everything above this line. This is why it was worth it:

Update 11/24/2011: Buying any other computer would have been a poor choice. I carry this machine everywhere. Keyboard shortcuts and gestures really are worthwhile; I don’t use an external mouse anymore.

Track Daily Earnings Across All Affiliate Networks

I’ve had the misfortune of becoming acquainted with the seedy, spammy world of affiliate marketing. This misfortune begat several web applications that generate modest amounts of revenue each month. Finding the total revenue is a chore because most marketers run campaigns from several different affiliate networks. When I want to see how much money I’m making, I don’t want to log in to three or four different sites.

To mitigate this mild inconvenience, I built Earnings, an OO-PHP app designed to scrape daily affiliate network earnings from the myriad sites that owe you money. It looks like this:

Source on github

Problem Description

Most affiliate networks don’t have APIs, so earnings data must be scraped from the HTML. Any scraper must get past the login screen and navigate to the earnings page to find the earnings data. This is problematic because the sites change often, breaking the login code and string parsing used to extract the data. Additionally, networks come and go. When there are no offers worth pursuing, marketers will have no need to view the earnings from that particular network. Furthermore, not all marketers work with the same networks. Any solution must make it easy to select which networks that will be scraped.

Solution Overview

This implementation takes advantage of object orientation in PHP5. The server performs the actual scraping in PHP classes that implement the abstract Network class. These classes are responsible for determining whether or not they have retrieved good data. If earnings data is unavailable due to bad credentials or a modified source site, a JSON-encoded error response is sent back to the client. Credentials for each affiliate network are stored server side in networks.json file.

The javascript client makes jQuery POST requests to authenticate with the server and get the data for the earnings list. Application state is maintained in the earnings_state global.

Limitations

I haven’t exposed any server-side methods to grab earnings from individual networks. The client can only grab the entire batch from the getEarnings javascript function. This is problematic because the calls are not multithreaded and it takes about 5 seconds to serially scrape earnings from 3 networks. Changing this wouldn’t be difficult. A little logic in earnings.php just needs to interface with the getEarnings PHP method of the Network class in question. The client could then call them all asynchronously.

Since all the earnings data lives scattered across the web on affiliate network sites and isn’t persisted on the server, it seems more efficient to cut out the PHP altogether and do everything client-side. Credentials could be saved in HTML5 localStorage. The major barrier is finding an elegant way to do cross-domain POST requests.

Finally, the getEarnings method has two date parameters to let the callee select a date range. These are ignored in the three Network classes I’ve written so far. They scrape earnings for the current day no matter what.

Contribute

Right now this app only supports CPAway.com, Copeac.com, and Maxbounty.com. I set it up so that it’s easy to add support for other networks in a modular fashion. To add your own network, do the following:

  • Fork Earnings on GitHub
  • Write a class that implements the abstract Network class. Look at Cpaway.php for an example of php curl use.
  • Add an entry in networks.json
  • Send me a pull request so we can have a comprehensive earnings scraping solution.