Monday, March 30, 2009

Clojure + Terracotta Update

I have reached the point in my Clojure + Terracotta experiment where I believe all of the features of Clojure are functional (Refs, Atoms, transactions, etc.). I do not have a way to test the Clojure functionality extensively, but I have run the clojure.contrib.test-clojure test suites successfully, as well as some simple tests on my machine.

There are still some open issues, and given the limited extent to which I have tested this, I would not consider this production quality in the least. I would welcome help from the Clojure community in testing this integration module. I'm sure there are unexplored corners.

Since several of the changes are relatively trivial, they could easily be integrated back into the Clojure core. I have detailed, as best I can, the changes I had to make to Clojure in this report: Clojure + Terracotta Integration Report

The code is available at GitHub (http://github.com/pjstadig/tim-clojure-1.0-snapshot/tree/master), and there are instructions on setting it up and running the code. If you have any difficulties or questions, please feel free to e-mail me at paul@stadig.name.

Thursday, March 5, 2009

Clojure + Terracotta: We Have REPLs!

Update: The Clojure + Terracotta integration is (I believe) feature complete. Details at http://paul.stadig.name/2009/03/clojure-terracotta-update.html.

JVM #1

paul@pstadig-laptop:~/tim-clojure/tim-clojure-1.0-SNAPSHOT/examples/shared-everything$ ./bin/dso-clojure repl.clj 
Starting BootJarTool...
2009-03-05 15:00:10,868 INFO - Terracotta 2.7.3, as of 20090129-100125 (Revision 11424 by cruise@su10mo5 from 2.7)
2009-03-05 15:00:11,428 INFO - Configuration loaded from the file at '/home/paul/tim-clojure/tim-clojure-1.0-SNAPSHOT/examples/shared-everything/tc-config.xml'.

Starting Terracotta client...
2009-03-05 15:00:14,904 INFO - Terracotta 2.7.3, as of 20090129-100125 (Revision 11424 by cruise@su10mo5 from 2.7)
2009-03-05 15:00:15,436 INFO - Configuration loaded from the file at '/home/paul/tim-clojure/tim-clojure-1.0-SNAPSHOT/examples/shared-everything/tc-config.xml'.
2009-03-05 15:00:15,656 INFO - Log file: '/home/paul/terracotta/client-logs/org.terracotta.modules.sample/20090305150015636/terracotta-client.log'.
2009-03-05 15:00:17,870 INFO - Statistics buffer: '/home/paul/tim-clojure/tim-clojure-1.0-SNAPSHOT/examples/shared-everything/statistics-127.0.1.1'.
2009-03-05 15:00:18,421 INFO - Connection successfully established to server at 127.0.1.1:9510
user=> (defn foo [] 42)
#'user/foo
user=>

JVM #2

paul@pstadig-laptop:~/tim-clojure/tim-clojure-1.0-SNAPSHOT/examples/shared-everything$ ./bin/dso-clojure repl.clj
Starting BootJarTool...
2009-03-05 15:01:39,663 INFO - Terracotta 2.7.3, as of 20090129-100125 (Revision 11424 by cruise@su10mo5 from 2.7)
2009-03-05 15:01:40,225 INFO - Configuration loaded from the file at '/home/paul/tim-clojure/tim-clojure-1.0-SNAPSHOT/examples/shared-everything/tc-config.xml'.

Starting Terracotta client...
2009-03-05 15:01:45,507 INFO - Terracotta 2.7.3, as of 20090129-100125 (Revision 11424 by cruise@su10mo5 from 2.7)
2009-03-05 15:01:46,091 INFO - Configuration loaded from the file at '/home/paul/tim-clojure/tim-clojure-1.0-SNAPSHOT/examples/shared-everything/tc-config.xml'.
2009-03-05 15:01:46,275 INFO - Log file: '/home/paul/terracotta/client-logs/org.terracotta.modules.sample/20090305150146254/terracotta-client.log'.
2009-03-05 15:01:50,868 INFO - Connection successfully established to server at 127.0.1.1:9510
"user=> "(foo)
42
"user=> "*print-dup*
false
"user=> "

Commentary

This is obviously an example of the "shared everything" approach. It's neither perfect nor complete, but it's a start. There are still some non-portable classes that need to be reworked. For some reason the second VM is printing the REPL prompt readably even though *print-dup* (as you can see) is false. I also still haven't worked out the problem with *in*, *out*, and *err*, and so on.

It's still very raw, but I'll see if I can't push to GitHub in the next day or two. This is an exciting first step!

Tuesday, March 3, 2009

Clojure + Terracotta: The Next Steps

Update: I've gotten *multiple* REPLs running with Terracotta. Details at http://paul.stadig.name/2009/03/clojure-terracotta-we-have-repl.html.

In my last post about Clojure + Terracotta I gave an example of sharing specific references between JVMs through Terracotta. This is what I call the "shared somethings" approach: you specify exactly what you would like to share. Another approach is what I call the "shared everything" approach.

Shared Everything

The goal of shared everything is to have multiple VMs working within a single global context through Terracotta. All of your vars and refs would be shared by default. The canonical test case is to define a function in one VM and have it show up automatically in another VM.

The first task was to move my work into a Terracotta Integration Module (TIM). A TIM packages the configuration for reuse, and it also lets you replace classes with clustered versions that work with Terracotta, without having to fork the original code base.

The second task was to replace a couple of classes in the Clojure codebase. The Namespace class uses AtomicReference, which is not supported by Terracotta. A minor change was necessary in Var, too, because it was using its dvals field as a sentinel value to indicate that the var is not bound. That does not work with Terracotta, because dvals is a ThreadLocal, so I created a NOT_BOUND sentinel field instead. There were some other changes as well. I'm not going to detail all of them, but you get the idea.
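
The sentinel idea can be illustrated in Clojure itself. This is only a sketch of the general pattern, not the actual change to Var (which is written in Java). The names not-bound, unbound-cell, bind-root, and cell-bound? are made up for this illustration. The point is that "unbound" is marked by a dedicated plain object rather than by reusing a field that has another job.

```clojure
;; Illustration only: hypothetical names, not the code in clojure.lang.Var.

;; A unique marker object. No other value can be identical? to it.
(def not-bound (Object.))

;; A "cell" here is just a map holding a root value.
(defn unbound-cell []
  {:root not-bound})

(defn bind-root [cell value]
  (assoc cell :root value))

;; A cell is bound when its root is anything other than the marker.
(defn cell-bound? [cell]
  (not (identical? not-bound (:root cell))))
```

Because the check uses identical? against the marker, nil stays a legitimate root value: a cell whose root has been set to nil still counts as bound.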

At this point I had hoped that I could run a REPL and possibly even try my canonical test case, but it was not that easy. I ran into two major roadblocks:

  1. *in*, *out*, and *err*. Being I/O streams, *in*, *out*, and *err* obviously cannot be shared through Terracotta. The problem is that they are stored as Vars and interned into the clojure.core namespace, so Terracotta will try to share them as part of the shared object graph. I could make clojure.lang.Var.root a transient field (through Terracotta's configuration file), but that would make the roots of all Vars transient, which is not what we want. Instead, I thought I needed some kind of TransientVar that could have a different root value (not just bindings) for each JVM. I pursued this a bit using the TIM's class replacement, but it got messy, and I concluded that if this is the route to take, the change should probably be made in the Clojure code itself, or at least that some changes to the Clojure code would make it easier. What I settled on (after a suggestion from Rich) is to leave the root bindings for *in*, *out*, and *err* as nil and let the REPL bind them. The REPL did not bind them for me, however, so I created my own repl.clj file that binds them and calls clojure.main/repl, and it works! This is only a temporary solution, though. Whether it is a TransientVar class or something else, we need a more permanent one.
  2. Classes. I am able to connect a single JVM to Terracotta, and run the REPL, but I cannot connect multiple JVMs, nor can I disconnect and reconnect a single JVM. When an instance of Clojure connects to Terracotta, it pulls a compiled function out of the object cache, and then throws a ClassNotFoundException because it cannot find the associated class. I started to pursue modifying the DynamicClassLoader and Compiler to store the compiled classes in the Terracotta object graph, and I still think that this might be the direction to go in. However, I wanted to go ahead and share what I have and get some feedback to see if there are any other solutions.
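
For the first roadblock, the workaround of binding the streams before starting the REPL might look roughly like the sketch below. This is not the repl.clj from the repository. The helper name with-local-io is hypothetical, and the stream wrappers are an assumption about what a REPL needs (a LineNumberingPushbackReader for input, plain writers for output).

```clojure
;; Sketch only: with-local-io is a hypothetical helper, not code from tim-clojure.
(require 'clojure.main)
(import '(clojure.lang LineNumberingPushbackReader)
        '(java.io InputStreamReader OutputStreamWriter PrintWriter))

(defn with-local-io
  "Calls f with *in*, *out*, and *err* thread-bound to this JVM's own
  standard streams, so their (shared) root values can stay nil."
  [f]
  (binding [*in*  (LineNumberingPushbackReader. (InputStreamReader. System/in))
            *out* (OutputStreamWriter. System/out)
            *err* (PrintWriter. (OutputStreamWriter. System/err) true)]
    (f)))

;; To start a REPL with these per-JVM bindings:
;; (with-local-io clojure.main/repl)
```

Since binding only sets thread-local values, each JVM gets its own streams and nothing stream-related ends up in the root of the shared Vars.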

The code is available at http://github.com/pjstadig/tim-clojure-1.0-snapshot/tree/master. In the "examples" directory I have a "shared-everything" example and a "shared-somethings" example. If you have any trouble running the examples, then let me know. There are some dead ends and some commented code that may not make sense, but my goal was to do a proof-of-concept first and to clean it up once I understand what needs to be done.

Conclusion

We are getting close to a "shared-everything" approach to integrating Clojure and Terracotta. We have some issues to deal with, but we are on our way to making this dream a reality.