E-Scribe : a programmer’s blog

About Me

I'm Paul Bissex. I build web applications using open source software, especially Django. Started my career doing graphic design for newspapers and magazines in the '90s. Then wrote tech commentary and reviews for Wired, Salon, Chicago Tribune, and others you've never heard of. Then I built operations software at a photography school. Then I helped big media serve 40 million pages a day. Then I worked on a translation services API doing millions of dollars of business. Now I'm building the core platform of a global startup accelerator. Feel free to email me.

Book

I co-wrote "Python Web Development with Django". It was the first book to cover the long-awaited Django 1.0. Published by Addison-Wesley and still in print!

Colophon

Built using Django, served with gunicorn and nginx. The database is SQLite. Hosted on a FreeBSD VPS at Johncompanies.com. Comment-spam protection by Akismet.

Stuff I Use

Bitbucket, Debian Linux, Django, Emacs, FreeBSD, Git, jQuery, LaunchBar, macOS, Markdown, Mercurial, Python, S3, SQLite, Sublime Text, xmonad

Spam Report

At least 236,610 pieces of comment spam killed since 2008, mostly via Akismet.

My 100x ROI as accidental domain speculator

One of the hazards of working in the web biz is impulse-buying domain names.

Back in the Web 2.0 boom days, there were a lot of “social” web plays with silly names.

I thought I’d satirize this by registering numbr.com and making a social site where you could “friend” the number 7 and that sort of thing.

I never got around to building that site. However I did get a curious email one day from “Joe” who wanted to know if I’d sell the name. He was with a startup that was going to offer temporary phone numbers for Craigslist postings or something. After some back and forth, we agreed on a price: $1000.

For a joke domain name I paid $10 for.

Tuesday, September 26th, 2017

Neo4J and Graph Databases

NoSQL is a big tent with lots of interesting tech in it. A few years ago at work I got an assignment to evaluate graph databases as a possible datastore for our 40-million-pageviews-a-day CMS. Graph DBs are elegant stuff, though not a particularly special fit for that application. Here's what I had to say.

Graph databases are all about "highly connected" data. But instead of tracking relationships through foreign-key mappings, RDBMS-style, they use pointers that directly connect the related records.

These relationships can also have directionality and descriptive properties.

Graph DBs store and retrieve data in a manner arguably more congruent with the true structure of heavily relational data than an RDBMS does.

Using an RDBMS with foreign keys and joins can mean a significant performance cost in join-heavy situations.

There are many products in the graph database space, many of them relatively new, with some variation in features and intended niche. I focus on Neo4j, which is the dominant player, mature, and open source.

Neo4j

Neo4j seems to be the most prominent and heavily used graph database product of the "property graph" type. Its sponsor is a company named Neo Technology. It was created in 2003 and open-sourced in 2007. It's under active development, but seems mature enough not to be undergoing disruptive changes. There's an active user community and a good ecosystem of third-party tools, and books are emerging as well.

Querying and Data Access

Cypher

Cypher is Neo4j's SQL-ish declarative query language.

One notable difference from SQL is that every database query has an explicit starting point. Usually this is a specific node in the graph. The Cypher START clause identifies this node. It's selected either by its ID or via an index lookup.

For example, given that almost any $BIGCMS object is attached to a specific site (or sites), many queries of graph-database $BIGCMS might start at a site node.

A common pattern for Cypher queries is START ... MATCH ... RETURN. (Keywords are not case-sensitive, but as with SQL, writing them in all caps improves query readability.)

Cypher session example ("//" begins a comment):

    // A mutating operation (e.g. CREATE) doesn't have to return anything, but it can.
    // Note that we did not have to declare our nodes' data structure before creating them.
    $ CREATE paper={name:"AJC"}, tv={name: "WSB TV"}, radio={name: "WSB radio"} RETURN paper, tv, radio
    ==> +-----------------------------------------------------------------------------+
    ==> | paper                | tv                      | radio                      |
    ==> +-----------------------------------------------------------------------------+
    ==> | Node[17]{name:"AJC"} | Node[18]{name:"WSB TV"} | Node[19]{name:"WSB radio"} |
    ==> +-----------------------------------------------------------------------------+
    ==> 1 row
    ==> Nodes created: 3
    ==> Properties set: 3
    ==> 3 ms

    // Establish the relationships, fetching start nodes by ID
    $ START tv=node(18), radio=node(19) CREATE tv-[:SAME_MARKET]->radio
    $ START tv=node(18), paper=node(17) CREATE tv-[:SAME_MARKET]->paper

    // Query the graph; "-" indicates relations, with optional "<" or ">" for direction
    $ START a = node(18) MATCH a-[:SAME_MARKET]-b RETURN DISTINCT b
    ==> +----------------------------+
    ==> | b                          |
    ==> +----------------------------+
    ==> | Node[17]{name:"AJC"}       |
    ==> | Node[19]{name:"WSB radio"} |
    ==> +----------------------------+

The Cypher relation syntax looks a bit noisy at first; it's helpful to think of it as a sort of ASCII-art diagram: "a-->b" or "a<--b" or "a-[:LOVES]->b" or "b-[:TOLERATES]->a" are all legal.

Other access modes

In addition to the declarative-style Cypher, there are other supported ways to access data.

The server has a REST API. In addition to being available for "raw" use, it is the basis for many of the tools and language bindings for Neo4j. For example, the provided Python bindings utilize the REST API internally.
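
For a rough sense of what "raw" use looks like, here's a minimal Python sketch that posts a Cypher query to the server. The endpoint path (/db/data/cypher) and payload shape are my assumptions based on the 1.x-era REST API, not something from the original evaluation.

    # Minimal sketch: send a Cypher query to a Neo4j 1.x server's REST API.
    # The /db/data/cypher endpoint and the {"query": ..., "params": ...}
    # payload shape are assumptions based on that era's API.
    import requests

    CYPHER_URL = "http://localhost:7474/db/data/cypher"

    payload = {
        "query": ("START a = node({node_id}) "
                  "MATCH a-[:SAME_MARKET]-b RETURN DISTINCT b"),
        "params": {"node_id": 18},
    }
    response = requests.post(CYPHER_URL, json=payload)
    response.raise_for_status()

    # The JSON response carries "columns" and "data" keys.
    for row in response.json()["data"]:
        print(row)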

The Neo4j shell, in addition to supporting Cypher commands, has utility functions that make interactive manipulation of graph data easier.

Gremlin is a graph traversal language based on Groovy ("the Python of Java"). It's provided as a plugin with the Neo4j distribution.

There's also py2neo, a comprehensive Python library for Neo4j access that also provides submodules for access via Cypher, Gremlin, Geoff (a graph modeling language by the same author), and raw REST.

Using Neo4j

The Neo4j "Community" version is what we would likely use. It's GPL licensed, and is the complete product.

They also offer two commercial versions, "Advanced" and "Enterprise." The selling points are advanced monitoring features, high availability support, a specialized web management console, and support services.

(The Advanced and Enterprise versions are also available under an Affero GPL license, but this is currently not practical for us.)

The user support ecosystem is what you would expect for an open source project. There's an official Google Group. Using Stack Overflow to ask questions is encouraged. There's a (quiet) IRC channel on Freenode. GitHub is used to distribute the source.

Scaling

Scaling a Neo4j database is not as simple as with a Dynamo-style store like Riak. Graphs are difficult to shard.

Neo4j has "high availability" features for clustering in the Neo4j Enterprise Edition. This is a master-slave setup. You can write to master or slave nodes, though there's a speed penalty for writing to slaves. All nodes get all writes eventually. Automatic fail-over can be set to elect any cluster member as master. A failed master node can later re-join as slave if desired.

In a cluster setup, backups can be performed by adding a slave to the cluster, which will pick up all the data. To restore, you stop the cluster, restore data from backup to at least one node, and re-start the cluster.

Neo Technology has been working for several years now on a system allowing the graph datastore to be distributed across servers, and to be scaled horizontally. This work (currently known as "Rassilon") will arrive with Neo4j 2.0 at the earliest (current stable version is 1.8).

Technical details

Neo4j is a JVM application (written in Java and Scala), so we would need to cultivate expertise in JVM deployment.

Neo4j likes to have its data in RAM -- specifically its node and property maps, which are mostly pointers. Having space to additionally hold the full property values in RAM is apparently not critical. Given that the vast bulk of $BIGCMS data is in property values, and that the total number of records (i.e. nodes) is nowhere near their hard limit of 32 billion, this seems achievable.

For best performance, Neo recommends maximizing the host OS's file caching; when possible, make the server's filesystem cache as large as the entire datastore.

Their JVM tuning advice is: give the JVM a large heap that will hold as much application data as possible, but also make sure the heap fits in RAM to avoid performance degradation from virtual memory paging. Along those lines Neo advises tuning Linux to be more tolerant of dirty virtual memory pages.

Installation

Ubuntu/Debian: Neo Technology provides an apt repository.

OS X: There's a Homebrew formula for the latest stable version of Neo4j.

Other Unix platforms (e.g. CentOS): Neo Technology provides tarballs containing the full binary release. And the source is available too of course.

Suitability

Proponents of graph database technology make a big deal of how well suited it is to relationship-heavy social media applications. While that's not currently a big niche for us, the technology still has some appeal.

One only needs to look at some of $BIGCMS's slowest, join-heavy SQL queries to know that a graph approach has the potential to increase performance greatly, and perhaps allow us to work with data in some ways that we have avoided or ignored because they are impractically slow.

And for our goal of "store structured data, not presentation," a graph database seems like an excellent fit. Graph relationships would give us the ability to record even more (readily usable) structure than we already do.

Final Thoughts

We could certainly speed up many slower $BIGCMS queries by moving from an RDBMS to a graph system. Our most pathologically slow SQL queries can take minutes. Getting our data into graph storage could eliminate many if not all of these.

However, the migration effort would be significant. Getting $BIGCMS data into graph form will require some careful thinking about how the data will be accessed. Common advice on creating a graph store is to think about the relationships first. This might lead to some rethinking of how we store data.

Since a major goal of $BIGCMS is to share content across sites, and we intend to build a library of that content, a graph database could offer a natural and powerful way to work with those connections.

If we were intending to directly replace our RDBMS store with a graph database, many migration challenges would arise that we might not see with other data store types. But since our data store will live behind a REST API, disruption at the application level might be no greater than with any other data store type (e.g. key-value).

As a more detailed design for the data store REST API is developed, we will likely have a better sense of how a graph database would serve in that design, and how its advantages would be felt.

Resources

O'Reilly is working on a Graph Databases book, which is currently available as a free pre-release PDF at http://graphdatabases.com/. It heavily emphasizes Neo4j.

Manning is publishing "Neo4j in Action" which is currently available under their Early Access Program.

Saturday, September 16th, 2017

How did I get here?

(I recently posted this on Quora in response to a question along the lines of "Engineers, when did you decide to study Computer Science?")

I have been a full-time software engineer for the last seven years, and a part-time one for ten years before that.

I have never formally studied computer science.

It wasn’t an option before college (small high school in rural Vermont). And at the otherwise excellent small liberal arts college I attended, it wasn't one of the available majors.

I learned to program because that was the most interesting thing to do with a computer in the 1980s. Every computer shipped with BASIC. I wrote a lot of programs. I played with every computer I could. I taught computer workshops at the local college. I read computer magazines and manuals.

In college I took the few classes that were available, including one with a disciple of Niklaus Wirth. I learned C, Modula-2, and Prolog. On the side I taught myself some assembly language, FORTRAN, and Forth.

When the web arrived in the ’90s, I started making pages; eventually I realized I could apply my old programming skills to this new medium, and started learning PHP.

After a few years of this I wanted to get more rigorous; I started learning additional languages (most notably Python) and good software engineering practices. I’ve done a few online classes and stints of self-study to help fill in the gaps in my education.

I’ve become a big fan of functional programming. I imagine I would have loved taking a Programming Languages class as part of a CS degree.

These days my focus is not so much on CS, but on being a good engineer: expanding my ability to design, develop, and maintain mission-critical web applications. That’s what I talk about when I guest-lecture for classes in our regional university’s CS department.

That’s my story.

Saturday, July 22nd, 2017

The Riak key-value database: I like it

(Note: This is a writeup I did a few years ago when evaluating Riak KV as a possible data store for a high-traffic CMS. At the time, the product was called simply "Riak". Apologies for anything else that has become out of date that I missed. Also please pardon the stiff tone! My audience included execs who we wanted to convince to finance our mad scientist data architecture ideas.)

Riak is a horizontally scalable, fault-tolerant, distributed, key/value store. It is written in Erlang; the Erlang runtime is its only dependency. It is open source but supported by a commercial company, Basho.

Its design is based on an Amazon creation called Dynamo, which is described in a paper Amazon published in 2007. The engineers at Basho used this paper to guide the design of Riak.

The scalability and fault-tolerance derive from the fact that all Riak nodes are full peers -- there are no "primary" or "replica" nodes. If a node goes down, its data is already on other nodes, and the distributed hashing system will take care of populating any fresh node added to the cluster (whether it is replacing a dead one or being added to improve capacity).

In terms of Brewer's "CAP theorem," Riak sacrifices immediate consistency in favor of the two other factors: availability, and robustness in the face of network partition (i.e. servers becoming unavailable). Riak promises "eventual consistency" across all nodes for data writes. Its "vector clocks" feature stores metadata that tracks modifications to values, to help deal with transient situations where different nodes have different values for a particular key.
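
As a toy illustration (not Riak's actual implementation), the core comparison a vector clock enables looks something like this:

    # Toy vector clock comparison -- illustration only; Riak manages its
    # vector clocks internally. Each writer increments its own counter;
    # one clock "descends from" another if it includes all of its updates.
    def descends(clock_a, clock_b):
        """True if clock_a has seen every update recorded in clock_b."""
        return all(clock_a.get(node, 0) >= count
                   for node, count in clock_b.items())

    v1 = {"node_a": 2, "node_b": 1}
    v2 = {"node_a": 1, "node_b": 1}
    v3 = {"node_a": 1, "node_b": 2}

    print(descends(v1, v2))  # True  -> v1 supersedes v2
    print(descends(v1, v3))  # False -> neither descends from the other:
    print(descends(v3, v1))  # False    concurrent siblings, needs resolution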

Riak's "Active Anti-Entropy" feature repairs corrupted data in the background (originally this was only done during reads, or via a manual repair command).

Queries that need to do more than simple key/value mapping can use Riak's MapReduce implementation. Query functions can be written in Erlang or JavaScript (running on SpiderMonkey). The "map" step execution is distributed, running on nodes holding the needed data -- maximizing parallelism and minimizing data transfer overhead. The "reduce" step is executed on a single node, the one where the job was invoked.
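
Here's a hypothetical sketch of submitting a job over the HTTP API; the /mapred endpoint and JSON job format are assumptions based on the Riak 1.x documentation of that era.

    # Hypothetical sketch: submit a MapReduce job over Riak's HTTP API.
    # Endpoint (/mapred) and job format are assumptions from the 1.x era.
    import requests

    job = {
        "inputs": "articles",  # walk every object in the "articles" bucket
        "query": [
            # Map phase: runs in parallel on the nodes that hold the data.
            {"map": {"language": "javascript",
                     "source": "function(v) { return [1]; }"}},
            # Reduce phase: runs on the coordinating node; Riak.reduceSum is
            # a built-in that sums the 1s emitted above, counting objects.
            {"reduce": {"language": "javascript", "name": "Riak.reduceSum"}},
        ],
    }
    resp = requests.post("http://localhost:8098/mapred", json=job)
    print(resp.json())  # e.g. [42]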

There is also a "Riak Search" engine that can be run on top of the basic Riak key/value store, providing fulltext searching (with the option of a Solr-like interface) while being simpler to use than MapReduce.

Technical details

Riak groups keys in namespaces called "buckets" (which are logical, rather than being tied to particular storage locations).

Riak has a first-class HTTP/REST API. It also has officially supported client libraries for Python, Java, Erlang, Ruby, and PHP, and unofficial libraries for C/C++ and JavaScript. There is also a Protocol Buffers API.
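
For flavor, a minimal sketch with the Python client; the method and attribute names reflect my recollection of the 1.x-era client and may differ between versions.

    # Minimal sketch using the official Python client. Names (bucket.new,
    # obj.store, fetched.data) are assumptions from the 1.x-era client.
    import riak

    client = riak.RiakClient()          # defaults to a local node
    bucket = client.bucket("articles")

    # Store a JSON-serializable value under a key.
    obj = bucket.new("article-123", data={"title": "Hello", "tags": ["news"]})
    obj.store()

    # Fetch it back; .data yields the deserialized value.
    fetched = bucket.get("article-123")
    print(fetched.data["title"])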

Riak distributes keys to nodes in the database cluster using a technique called "consistent hashing," which prevents the need for wholesale data reshuffling when a node is added or removed from the cluster. This technique is more or less inherent to Dynamo-style distributed storage. It is also reportedly used by BitTorrent, Last.fm, and Akamai, among others.
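
A toy version of the idea looks like this (illustration only; Riak's real ring also uses virtual nodes and replication):

    # Toy consistent-hashing ring. Keys and nodes hash onto the same ring;
    # a key belongs to the first node clockwise from its position, so adding
    # or removing a node only relocates the keys in that node's slice.
    import bisect
    import hashlib

    def ring_position(value, ring_size=2**32):
        return int(hashlib.sha1(value.encode()).hexdigest(), 16) % ring_size

    class HashRing:
        def __init__(self, nodes):
            self.ring = sorted((ring_position(n), n) for n in nodes)
            self.positions = [pos for pos, _ in self.ring]

        def node_for(self, key):
            idx = bisect.bisect(self.positions, ring_position(key)) % len(self.ring)
            return self.ring[idx][1]

    ring = HashRing(["riak1", "riak2", "riak3"])
    print(ring.node_for("article-123"))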

Riak offers some tunable parameters for consistency and availability. E.g. you can say that when you read, you want a certain number of nodes to return matching values to confirm. These can even be varied per request if needed.
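
For example, a sketch against the HTTP API; as I recall, the read quorum is expressed there as an "r" query parameter, though the parameter name and URL scheme should be treated as assumptions.

    # Sketch: per-request read quorum via the HTTP API's "r" query parameter
    # (parameter name and URL scheme are assumptions from the 1.x-era API).
    import requests

    resp = requests.get(
        "http://localhost:8098/riak/articles/article-123",
        params={"r": 2},  # require two replicas to answer before returning
    )
    print(resp.status_code, resp.headers.get("Content-Type"))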

Riak's default storage backend is "Bitcask." This does not seem to be something that many users feel the need to change. One operational note related to Bitcask is that it can consume a lot of open file handles. For that reason, Basho advises raising the open-files ulimit on machines running Riak.

Another storage backend is "LevelDB," a library from Google whose design draws on BigTable. Its major selling point versus Bitcask seems to be that while Bitcask keeps all keys in memory at all times, LevelDB doesn't need to. My guess, based on our existing corpus of data, is that this limitation of Bitcask is unlikely to be a problem.

Running Riak nodes can be accessed directly via the riak attach command, which drops you into an Erlang shell for that node.

Bob Ippolito of Mochi Media says: "When you choose an eventually consistent data store you're prioritizing availability and partition tolerance over consistency, but this doesn't mean your application has to be inconsistent. What it does mean is that you have to move your conflict resolution from writes to reads. Riak does almost all of the hard work for you..." The implication here is that our API implementation may include some code that ensures consistency at read time.
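
A toy example of what that read-time resolution code might look like -- pure Python, independent of any particular client API:

    # Toy read-time resolver: given the sibling values Riak returns for a
    # key whose writes conflicted, merge them deterministically. This is
    # application code, not part of Riak itself.
    def resolve_tag_siblings(siblings):
        """Merge divergent tag lists by taking their union."""
        merged = set()
        for tags in siblings:
            merged.update(tags)
        return sorted(merged)

    # Two writers added different tags while a partition hid them from each
    # other; at read time the application reconciles both versions (and would
    # normally write the merged value back).
    print(resolve_tag_siblings([["news", "local"], ["news", "sports"]]))
    # -> ['local', 'news', 'sports']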

Operation

Riak is controlled primarily by two command-line tools, riak and riak-admin.

The riak tool is used to start or stop Riak nodes.

The riak-admin tool controls running nodes. It is used to create node clusters from running nodes, and to inspect the state of running clusters. It also offers backup and restore commands.

If a node dies, a process called "hinted handoff" kicks in. This takes care of redistributing data -- as needed, not en masse -- to other nodes in the cluster. Later, if the dead node is replaced, hinted handoff also guides updates to that node's data, catching it up with writes that happened while it was offline.

Individual Riak nodes can be backed up while running (via standard utilities like cp, tar, or rsync), thanks to the append-only nature of the Bitcask data store. There is also a whole-cluster backup utility, but if this is run while the cluster is live there is of course risk that some writes that happen during the backup will be missed.

Riak upgrades can be deployed in a rolling fashion without taking down the cluster. Different versions of Riak will interoperate as you upgrade individual nodes.

Part of Basho's business is "Riak Enterprise," a commercially supported version of Riak. It includes multi-datacenter replication, 24x7 support, and various services for planning, installation, and deployment. Cost is $4,000-$6,000 per node, depending on how many you buy.

Overall, low operations overhead seems to be a hallmark of Riak. This is both in day-to-day use and during scaling.

Suitability for use with our CMS

One of our goals is "store structured data, not presentation." Riak fits well with this in that the stored values can be of any type -- plain text, JSON, image data, BLOBs of any sort. Via the HTTP API, Content-Type headers can help API clients know what they're getting.
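
A sketch of what that looks like against the HTTP API (the legacy /riak/<bucket>/<key> URL scheme here is an assumption from that era):

    # Sketch: the Content-Type set at write time is returned on reads, so
    # API clients know whether they got JSON, plain text, an image, etc.
    import json
    import requests

    url = "http://localhost:8098/riak/articles/article-456"
    requests.put(url,
                 data=json.dumps({"headline": "Structured, not presentational"}),
                 headers={"Content-Type": "application/json"})

    resp = requests.get(url)
    print(resp.headers["Content-Type"])  # application/json
    print(resp.json()["headline"])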

If we decide we need to have Django talk to Riak directly, there is an existing "django-riak-engine" project we could take advantage of.

Tastypie, which powers our API, does not actually depend on the Django ORM. Its documentation even features an example using Riak as the data store.

The availability of client libraries for many popular languages could be advantageous, both for leveraging developer talent and for integrating with other parts of the stack.

Final thoughts

I am very impressed with Riak. It seems like an excellent choice for a data store for the CMS. It promises the performance needed for our consistently heavy traffic. It's well established, so in using it we wouldn't be dangerously out on the bleeding edge. It looks like it would be enjoyable to develop with, especially using the HTTP API. The low operations overhead is very appealing. And finally, it offers flexibility, scalability, and power that we will want and need for future projects.

Friday, February 24th, 2017

Quora questions I've seen enough of

I really do like Quora (you may have seen my SadQuora tweets, a side effect of the time I spend there). But when somebody asked, "What are the most annoying types of questions on Quora?" I couldn't resist. Maybe it's just my feed, but I see things like these a lot:

Tuesday, October 4th, 2016