Hello (Distributed) World: Designing Software that Spreads P2P

What if simply being connected to a LAN or even the Internet gave you access to any application ever run on any other computer? What if your computer could automatically retrieve and launch those applications, no matter how complex, using a simple hash-based identifier? What if the right software just arrived at the right time?

Answering these questions is part of our goal with Skynet[1] and Spin, the distributed programming language on which Skynet is built. When a user needs a new Spin application, Skynet seamlessly retrieves it from a peer on the local network or the Internet.

To make this possible, applications written in Spin are structured a little differently.

Hash everything

To start with, every part of a Spin application is identified by its SHA-1 digest. We refer to these identifiers as uuids.[2] These uuids let Skynet locate and download parts of applications over our peer-to-peer network and trust the results without needing to trust other peers.
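To make the trust argument concrete, here is a minimal sketch in Ruby (not Spin, and not Skynet’s actual code). It assumes a uuid is the full hex SHA-1 of a chunk’s bytes; the helper names uuid_for and verified_chunk are made up for illustration.

```ruby
require 'digest/sha1'

# Assumption: a chunk's uuid is the full 40-character hex SHA-1 of its bytes.
def uuid_for(bytes)
  Digest::SHA1.hexdigest(bytes)
end

# Hypothetical helper: accept bytes from an untrusted peer only if they
# hash to the uuid that was requested.
def verified_chunk(requested_uuid, bytes)
  actual = uuid_for(bytes)
  unless actual == requested_uuid
    raise ArgumentError, "expected #{requested_uuid}, got #{actual}"
  end
  bytes
end

original = 'Hello distributed world.'
wanted   = uuid_for(original)

# An honest peer's bytes pass the check and come back unchanged.
verified_chunk(wanted, original)

# A peer that altered the bytes is caught, whoever that peer is.
begin
  verified_chunk(wanted, original + ' (tampered)')
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Because the identifier is derived from the content, the check needs no knowledge of which peer supplied the bytes.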

Portrait of a fluid application

A Spin application is broken into several chunks. The most important of these is the application’s descriptor. This descriptor is, in some sense, the head of the application. In order to spawn an application, the spawning process must have a copy of the application’s descriptor.

However, the descriptor doesn’t contain any code. Instead, it contains information such as the name and version. It can even include a description or icon. Most importantly, it contains a list of required dependencies specified by uuid.

An application will not start unless all of its dependencies are available. These dependencies may be any kind of resource. One of them will be its own code segment.

The uuid of the descriptor itself uniquely identifies the entire application.
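The “will not start unless all of its dependencies are available” rule can be sketched roughly as follows. This is Ruby rather than Spin, and the Descriptor struct, missing_dependencies, and spawnable? are hypothetical names, not Skynet’s API.

```ruby
# Hypothetical descriptor: metadata plus required uuids, but no code.
Descriptor = Struct.new(:name, :required_uuids)

# Returns the required uuids that are not yet available locally.
def missing_dependencies(descriptor, local_uuids)
  descriptor.required_uuids.reject { |uuid| local_uuids.include?(uuid) }
end

# An application may be spawned only when nothing is missing.
def spawnable?(descriptor, local_uuids)
  missing_dependencies(descriptor, local_uuids).empty?
end

code_uuid = '0a0ac5781c34c69dac3996103c540e265ec81e67'
app       = Descriptor.new('hello-world', [code_uuid])

puts spawnable?(app, [])           # nothing fetched yet
puts spawnable?(app, [code_uuid])  # code segment is available
```

In this sketch, anything reported by missing_dependencies is what would have to be fetched from peers before the spawn can go ahead.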

Save space, save bandwidth

Like dynamically linked native code, our dependency system saves space by storing each library or resource at most once, no matter how many applications depend on it. Just as importantly, each dependency will only ever be downloaded once.
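A content-addressed store gets both properties almost for free. The sketch below is an illustration in Ruby under the assumption that chunks are keyed by their SHA-1 uuid; ChunkStore and its fetcher block are invented for the example and stand in for the peer-to-peer network.

```ruby
require 'digest/sha1'

# Hypothetical local store keyed by uuid. The block passed to new stands in
# for "ask a peer for this uuid".
class ChunkStore
  attr_reader :downloads

  def initialize(&fetcher)
    @chunks    = {}
    @fetcher   = fetcher
    @downloads = 0
  end

  # Return the chunk, downloading it only if it is not already stored.
  def fetch(uuid)
    return @chunks[uuid] if @chunks.key?(uuid)
    @downloads += 1
    @chunks[uuid] = @fetcher.call(uuid)
  end

  def size
    @chunks.size
  end
end

library  = 'shared library bytes'
lib_uuid = Digest::SHA1.hexdigest(library)
network  = { lib_uuid => library }

store = ChunkStore.new { |uuid| network.fetch(uuid) }

# Two different applications depend on the same library.
store.fetch(lib_uuid)  # first application triggers the download
store.fetch(lib_uuid)  # second application reuses the stored copy

puts store.downloads   # one download
puts store.size        # one stored copy
```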

Dependable dependency resolution

Like a statically linked executable, a Spin application will always use the exact dependencies it was compiled against. A library cannot be updated out from under an application and introduce unexpected behavior (or new bugs). If an application depends on a buggy library, it must be re-released in order to take advantage of a patched version.
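One way to see why a patched library forces a re-release: if the descriptor lists its dependencies by hash, and the descriptor’s own uuid is a hash of that list, then any change to a dependency changes the application’s uuid. The following Ruby sketch assumes a made-up descriptor serialization (name plus sorted dependency uuids); Spin’s real format is not shown in this post.

```ruby
require 'digest/sha1'

def uuid_for(bytes)
  Digest::SHA1.hexdigest(bytes)
end

# Hypothetical serialization: the application name followed by the sorted
# uuids of its dependencies, one per line.
def descriptor_uuid(name, dependency_chunks)
  required = dependency_chunks.map { |bytes| uuid_for(bytes) }.sort
  uuid_for(([name] + required).join("\n"))
end

buggy   = descriptor_uuid('hello-world', ['main code', 'library (buggy)'])
patched = descriptor_uuid('hello-world', ['main code', 'library (patched)'])

# Same name, same main code, but a different library means a different
# application uuid, so the old release keeps its old behavior.
puts buggy == patched
```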

Using exact dependencies was a difficult decision. Tight and loose coupling both have advantages. In general, modern operating systems encourage dynamic linking. This can help them conserve space and RAM while making system updates easier for conventional software.

Our hash-based distribution model already provided those benefits. In the end, we chose exact dependencies because it made our software easier to reason about and seemed to produce more reliable applications.[3]

Hello world as a fluid application

The simplest Spin application is a folder containing two .spin source files.

hello-world/
  main.spin
  metadata.spin

First, metadata.spin is compiled and run, returning a partial application descriptor. Then all .spin files except metadata.spin are compiled and included in the code segment.

Creating the descriptor

Here’s a simple metadata.spin file:

'{ type: 'application }

As you can see, you don’t have to specify much.[4] The compiler automatically adds other basic fields. It infers the application name from the directory and calculates the dependency uuids during compilation. The dependencies are stored under the “required-uuids” key and the “data-uuid” key specifies which of them is the code segment.

'{
  commit-status: "later"
  commit-timestamp: 1281558040
  data-uuid: "0a0ac5781c34c69dac3996103c540e265ec81e67"
  imported-metadata-uuids: '{}
  major-version: 0
  micro-version: "2010.223.73240"
  minor-version: 0
  modification-timestamp: 1281558040
  name: "hello-world"
  payloads: '{}
  required-sizes: '{0a0ac5781c34c69dac3996103c540e265ec81e67: 276}
  required-sizes-total: 276
  required-uuids: '["0a0ac5781c34c69dac3996103c540e265ec81e67"]
  type: "application"
  version: '[0, 0, "2010.223.73240", "later", 0]
}
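The fields in this descriptor appear to be related in a few simple ways. The Ruby sketch below checks relationships inferred from the example alone (the code segment is one of the required uuids, every required uuid has a size, and the sizes add up to the total). These checks are an assumption for illustration, not Skynet’s actual validation, and descriptor_problems is a hypothetical name.

```ruby
# A Ruby hash mirroring a few fields of the compiled descriptor.
descriptor = {
  'data-uuid'            => '0a0ac5781c34c69dac3996103c540e265ec81e67',
  'required-sizes'       => { '0a0ac5781c34c69dac3996103c540e265ec81e67' => 276 },
  'required-sizes-total' => 276,
  'required-uuids'       => ['0a0ac5781c34c69dac3996103c540e265ec81e67']
}

# Hypothetical consistency check; returns a list of problems found.
def descriptor_problems(descriptor)
  uuids = descriptor.fetch('required-uuids')
  sizes = descriptor.fetch('required-sizes')
  problems = []
  unless uuids.include?(descriptor.fetch('data-uuid'))
    problems << 'data-uuid is not listed in required-uuids'
  end
  unless sizes.keys.sort == uuids.sort
    problems << 'required-sizes does not match required-uuids'
  end
  unless sizes.values.sum == descriptor.fetch('required-sizes-total')
    problems << 'required-sizes-total is not the sum of required-sizes'
  end
  problems
end

puts descriptor_problems(descriptor).empty?
```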

The moment we’ve all been waiting for…

All applications have a spawn entry point that defaults to “main.” That is, when spawned, execution begins at the top of the main.spin file.

Log.as('green, "Hello distributed world.")

And that’s the simplest fluid application.[5] Like all Spin applications, this version of the “hello-world” application is uniquely identifiable by its uuid.

3bd42a425d8b3d45d569527b88ec87f040db6257

Those 40 characters describe a complete application, waiting to be fetched from the network and launched on any computer running Skynet.


Learn more about Skynet and help make it great.

  1. Skynet is peer-to-peer desktop software designed to connect computers directly to each other so that their owners can seamlessly share media with friends and collaborate with colleagues. 

  2. This differs from the IETF definition. Unlike IETF Version 5 UUIDs, we do not truncate the value to 128 bits. In addition, we will be transitioning to SHA-256 in the future. To distinguish the two definitions, we write ours as “uuid” in all lowercase.

  3. Of course, not all relationships are suitable for exact linking. For example, communication with system services happens over loosely coupled messaging. These relationships must be declared and versioned to allow system upgrades. Applications whose service version requirements cannot be met cannot be run.

  4. Applications can be spawned, unlike “libraries,” but are always started explicitly, unlike “services.” 

  5. Spin does support writing to standard out, but in practice logging is almost always better. Consider that many running Skynet instances have no terminal attached to their standard out, but all support remote log viewing over our peer-to-peer network. 


If You Can’t Make It Fast, Make It Parallel

Building your own programming language is a lot of work, and you can’t do everything at once. Sometimes you know exactly how to make your language better, but other considerations take priority.

We know how to make our programming language Spin[1] a heck of a lot faster, but working on its first killer app, Skynet, is more important. Which brings me to a confession…

We’re no faster than Ruby 1.8

I love Ruby, but I consider it an established low-water mark for language performance. Ruby is just fast enough to use for real work, but developers can and will complain about performance. As a language designer you really want to be at least a little faster than Ruby 1.8.[2]

Fear not! We built Spin for distributed programming. And once we had applications talking to applications over asynchronous messages, we started to want modules within those applications to communicate the same way.[3]

Instead of building objects that have methods, sometimes we built threads that respond to messages. Before we knew it, our language was designed around lightweight threads and shared-nothing message passing.
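For readers who have not seen the style, here is a rough Ruby analogue of “a thread that responds to messages.” It is only a sketch: Ruby’s Thread and Queue are not Spin’s green threads, and spawn_counter and the message names are invented for the example. The point is that the counter’s state lives inside the thread and is reachable only through its inbox.

```ruby
require 'thread' # Queue is built in on modern Ruby; harmless elsewhere

# Hypothetical message-driven counter. Returns its inbox and its thread.
def spawn_counter
  inbox  = Queue.new
  thread = Thread.new do
    count = 0 # private state: no other thread can touch this variable
    loop do
      kind, payload = inbox.pop # blocks until a message arrives
      case kind
      when :add  then count += payload
      when :get  then payload << count # payload is the caller's reply queue
      when :stop then break
      end
    end
  end
  [inbox, thread]
end

inbox, thread = spawn_counter
inbox << [:add, 2]
inbox << [:add, 3]

reply = Queue.new
inbox << [:get, reply]
total = reply.pop # messages are handled in order, so both adds are applied

inbox << [:stop, nil]
thread.join
puts total
```

Where an object would expose add and get as methods, this version exposes them as messages, so callers never share memory with the counter.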

Spin could be faster, but in the meantime, it’s got a saving grace.

An idle Skynet has 160 green threads

It spikes way higher than that while doing active work.[4] All of those green threads are mapped to a smaller number of native threads. Unlike some other languages, Spin actually gets faster every time Intel or AMD releases a new multi-core monster… if you run it on that new computer.

So our language isn’t a speed demon, but software written in it scales beautifully.[5] It’s an interesting position to be in. Skynet completely pegs both cores of my MacBook Pro when starting, but it boots twice as fast as if we’d gone with a more conventional language design.

I can’t wait for Spin to be faster, but in the meantime it’s good enough that it’s parallel. When we improve our language performance, the threads running on every core get faster. And if the number of cores grows faster than megahertz, well, then I’ll really be glad we went parallel.[6]



  1. Spin is a horribly confusing codename for our language since there are so many other projects already using that name. We promise to rename it as soon as we have a million developers (or sooner ;-). 

  2. As of Ruby 1.9, Ruby itself is faster than Ruby 1.8. Go Ruby!

  3. More on our slide into distributed programming later. 

  4. Virtually all of the 160 steady state green threads are in a wait state, so CPU consumption is minimal. And since green threads are relatively cheap, spinning up more isn’t a big deal either. 

  5. Not to mention that IO is performed in specially allocated native threads, so you never accidentally block a UI update or working thread just because you’re reading a large file. 

  6. Today, even a decent desktop has four cores. Starting in August you’ll be able to order a Mac Pro with twelve cores and SMT for 24 threads of execution. Imagine how much faster Spin would be on one of those! 


Practical Ruby Projects

My Ruby book Practical Ruby Projects: Ideas for the Eclectic Programmer is in stores!



I’ve been writing for the past year, and I’m crazy about the result. This is the sort of book that I love to read, and hopefully that shines through.

Practical Ruby Projects is based on a few ideas:

  • “Hands on” is always better.
  • You can cover more if you trust your reader to know the basics.
  • Programming can be creative and exciting.
  • Getting started is hard. Anything that makes that easy is good.

This book covers a lot of projects (all done in Ruby, of course!):

  • Animation
  • Music
  • Simulation
  • Turn-Based Strategy Games
  • RubyCocoa
  • Genetic Algorithms
  • Implementing Lisp
  • Parsing

P.S. All the source code is available under the MIT license!


Making Ruby into PHP

Sometimes a web framework is just overkill. Maybe it’s a one-off dynamic page, or maybe you don’t want the memory footprint of a whole framework running for only a small component of your site.

mod_ruby and eruby can get you a lot of what PHP gives you if you’re into that and don’t mind the setup. There’s the added perk that you won’t have to write PHP.

But what if you want something similar, but quick and simple and you’re willing to use CGI? Here’s a neat little trick.

First save this text into a file called “rubyhp.rb” in your cgi-bin directory.

require 'erb'
require 'cgi'

cgi = CGI.new
print "Content-type: text/html\n\n"
print ERB.new(DATA.read).result(binding)

Leave this file non-executable, so your web server won’t serve it. Then create your PHP-style Ruby file like this.

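#!/usr/bin/ruby
# Ruby 1.9 and later no longer search the current directory on require.
# On Ruby 1.8, use: require 'rubyhp'
require_relative 'rubyhp'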
__END__
<html>
 <body>
  <% cgi.params.each do |key, value| %>
   <%= key %>: <%= value %><br />
  <% end %>
  <% if cgi.params.empty? %>
   Sorry, please enter some cgi parameters. How about "?foo=baz"?
  <% end %>
 </body>
</html>

Name this file “test.rb” and save it in the same cgi-bin directory. Make it executable and you’re done. You can write whatever ERB you want after the __END__ line, without needing to worry about setting up anything fancy. And you have access to a parsed CGI object through the variable ‘cgi’.
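The trick works because ERB’s result method evaluates the template against a binding, so any local variable in scope at that point is visible inside the template. Here is a small standalone sketch of that mechanism that runs without a web server. The render helper is made up for the example, and a plain Hash stands in for cgi.params (whose values are arrays in the real CGI library).

```ruby
require 'erb'

# Hypothetical helper: the template can see `params` because the binding
# captured here includes this method's local variables.
def render(template, params)
  ERB.new(template).result(binding)
end

template = '<% params.each do |key, value| %><%= key %>: <%= value %><br /><% end %>'

puts render(template, 'foo' => 'baz')
# prints: foo: baz<br />
```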

Does CGI still have a place in modern web work? I don’t know. But I do still use it for quick dynamic pages on cyll.org like the Computer Science Bandname Generator. This little trick makes life a little easier for me when I do.


Stupid IRB Tricks

This post reflects some tricks I sent out in a mail to PDX.rb and some others I posted on my Intel internal Ruby blog. There’s nothing terribly novel here, but if you haven’t stumbled on these yet, they might save you some time.

Your .irbrc file gives you a lot of control over what your IRB looks like each time it starts. Here’s what my .irbrc file looks like:

require 'irb/completion'
ARGV.concat [ "--readline", "--prompt-mode", "simple" ]

class Object
  def mymethods
    (self.methods - self.class.superclass.instance_methods).sort
  end
end

The first two lines turn on tab completion. If you don’t have this on already, turn it on now! It only works when IRB can figure out the type of expressions, but it helps make the interpreter more friendly. Type []. and hit tab to see the array methods tab complete.

The second chunk sets up a useful introspection function that Ben and I came up with. It allows me to type foo.mymethods and get only the methods that are defined for foo, but not the methods defined in its superclass, which I find is often what I want (and sorted!). This is important because sometimes Ruby’s humane interface means the number of methods can be a little overwhelming.
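To see what that subtraction does without touching Object, here is the same expression as a standalone function, applied to a small class. The names own_methods and Greeter are invented for the example; on Ruby 1.9 and later the method names come back as symbols.

```ruby
# Same expression as mymethods, written as a plain function for illustration.
def own_methods(object)
  (object.methods - object.class.superclass.instance_methods).sort
end

class Greeter
  def hello
    'hello'
  end

  def goodbye
    'goodbye'
  end
end

p own_methods(Greeter.new)
# prints: [:goodbye, :hello]
```

Everything inherited from Object is subtracted away, leaving only what Greeter itself adds.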

Oh, and one other tip! If you’re doing interactive shell scripting in irb, most Ruby expressions return a value, so irb echoes the huge list of files you’re copying (or text-replacing, or whatever) after every command. Waiting for hundreds of lines to print each time can start to drive you nuts. Thankfully, you can shut off this echoing behavior in irb by typing:

conf.echo = nil

Those are all the tricks I can think of, but there must be more out there…


Comp Sci Bands

Can’t think of a name for your new Computer-Science-Rock Band? Try the Geek Chic Bandname Generator. It produces gems like…

  • Bjarne Stroustrup and the Harvard Architectures
  • Alan Kay and the Loop Invariants
  • James Gosling and the Traveling Salesmen
  • Alonzo Church and the Regular Expressions
  • Donald Knuth and the Dining Philosophers

A Quine Is A Quine Is A Quine

After re-reading Ken Thompson’s Reflections on Trusting Trust, I decided to write a Ruby quine.

A quine is a “program that generates a copy of its own source text as its complete output.” I decided to use the classic Lisp quine as a foundation.

((lambda (x) (list x (list (quote quote) x))) (quote (lambda (x) (list x (list (quote quote) x)))))

This is what I came up with:

def x(s); puts %Q{#{s} x(%q{#{s}})}; end; x(%q{def x(s); puts %Q{#{s} x(%q{#{s}})}; end;})
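If you want to convince yourself that this really is a quine, one option is to run the source and compare what it prints with the source itself. The sketch below does that in-process by temporarily swapping $stdout. The output_of helper and the QUINE constant are names made up for the check, and evaluating the quine this way defines its x method at the top level as a side effect.

```ruby
require 'stringio'

# The quine, held in a single-quoted string so nothing is interpolated.
QUINE = 'def x(s); puts %Q{#{s} x(%q{#{s}})}; end; x(%q{def x(s); puts %Q{#{s} x(%q{#{s}})}; end;})'

# Hypothetical helper: evaluate some Ruby source and return what it printed.
def output_of(source)
  captured        = StringIO.new
  original_stdout = $stdout
  $stdout         = captured
  eval(source, TOPLEVEL_BINDING)
  captured.string
ensure
  $stdout = original_stdout
end

# puts adds a trailing newline, so strip it before comparing.
puts output_of(QUINE).chomp == QUINE
```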

The Quine Page doesn’t have any Ruby Quines, but there are a bunch on the Ruby Quines page. If you remove the extra spaces from my quine, it’s 84 characters. The shortest Ruby quine known is 28 characters:

puts <<2*2,2
puts <<2*2,2
2

Pretty cute!

