Notes from the development meeting
michelle.dsouza at utoronto.ca
Thu Dec 3 20:18:16 UTC 2009
Hi everyone,
I took some rough notes at today's meeting. I hope others who were at
the meeting will add to and correct this.
Michelle
Kettle
In order to use Kettle for Decapod there are two things that need to
be done. We require the ability to make system calls and we also
require Kettle to be untangled from Engage.
Our requirement for system calls comes out of needing to start command
line processes, determine the status of a command line process that
we've started, and stop processes. We may also need to queue
long-running processes rather than run them all in parallel.
Two possible candidates that have been mentioned are Quartz
http://www.quartz-scheduler.org/ and libexpect
http://www.tcl.tk/man/expect5.31/libexpect.3.html
One issue we talked about at the meeting was that we need to think
carefully about what we can take with us when we move from Rhino to V8.
We likely don't need something at the level of Quartz or libexpect
since our needs are so simple.
The best candidate for porting from Rhino to V8 is something written
in JavaScript. The suggestion was made that perhaps we should write
our own, despite the maintenance costs, since we have such simple needs.
The current plan for Decapod's 0.1 is to write our own extremely
simple system call support - just enough to port Decapod from
CherryPy to Kettle. This allows us to defer the decision about a
third-party package. We will also continue to look for alternatives.
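To make the list of needs concrete, here is a rough sketch of what a very small home-grown layer might look like: start a command, ask for its status, stop it, and queue jobs so only one runs at a time. Everything here is an assumption for illustration, not a decided design. In particular `ProcessQueue` and the `launch(command, onExit)` hook are hypothetical names; the hook stands in for whatever Rhino or V8 would give us to actually spawn a process, and is expected to return a handle with a `kill()` method.

```javascript
// Hypothetical sketch: a serial queue for long-running command line jobs.
// ASSUMPTION: launch(command, onExit) starts the command, returns a handle
// with a kill() method, and calls onExit(exitCode) when the process ends.
function ProcessQueue(launch) {
    this.launch = launch;
    this.pending = [];   // jobs waiting for their turn
    this.current = null; // the job running now, if any
    this.jobs = {};      // job id -> job record
    this.nextId = 1;
}

// Queue a command and return an id that can be used to query or stop it.
ProcessQueue.prototype.start = function (command) {
    var job = {id: this.nextId++, command: command, status: "queued",
               handle: null, exitCode: null};
    this.jobs[job.id] = job;
    this.pending.push(job);
    this._runNext();
    return job.id;
};

// One of "queued", "running", "finished", "stopped" or "unknown".
ProcessQueue.prototype.status = function (id) {
    var job = this.jobs[id];
    return job ? job.status : "unknown";
};

// Stop a queued or running job; returns false if there was nothing to stop.
ProcessQueue.prototype.stop = function (id) {
    var job = this.jobs[id];
    if (!job) { return false; }
    if (job.status === "queued") {
        this.pending.splice(this.pending.indexOf(job), 1);
        job.status = "stopped";
        return true;
    }
    if (job.status === "running") {
        job.status = "stopped";
        job.handle.kill();
        this.current = null;
        this._runNext();
        return true;
    }
    return false;
};

// Start the next pending job unless one is already running.
ProcessQueue.prototype._runNext = function () {
    if (this.current || this.pending.length === 0) { return; }
    var that = this;
    var job = this.pending.shift();
    this.current = job;
    job.status = "running";
    job.handle = this.launch(job.command, function (exitCode) {
        if (job.status !== "running") { return; } // already stopped
        job.exitCode = exitCode;
        job.status = "finished";
        that.current = null;
        that._runNext();
    });
};
```

Keeping the platform-specific part behind a single injected function is one way to limit what has to be rewritten when we move from Rhino to V8; the queueing logic itself is plain JavaScript.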
As far as splitting out Kettle and Engage, I sent a detailed message
to the list about what needs to be done and the goals for the work.
The one issue that I hadn't mentioned was what to do with Kettle
dependencies. We decided that the most reasonable approach was to have
two configuration files - one for Kettle and one for the application.
The Kettle configuration would specify all of Kettle's dependencies
and the application config would specify all of its dependencies.
Kettle would compare the two to ensure we are not loading dependencies
twice.
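As a rough illustration of the comparison step, here is a minimal sketch. The configuration format was not decided at the meeting, so the `dependencies` key, the file names and the `resolveDependencies` function are all made up for the example.

```javascript
// ASSUMPTION: each configuration lists its dependencies as an array of
// file names under a "dependencies" key. These names are hypothetical.
var kettleConfig = {dependencies: ["lib/jquery.js", "lib/infusion.js"]};
var appConfig = {dependencies: ["lib/jquery.js", "app/decapod.js"]};

// Returns the combined load order with each dependency listed once,
// plus every entry that was requested more than once.
function resolveDependencies(kettle, app) {
    var seen = {};
    var load = [];
    var duplicates = [];
    var all = kettle.dependencies.concat(app.dependencies);
    for (var i = 0; i < all.length; i++) {
        var name = all[i];
        if (Object.prototype.hasOwnProperty.call(seen, name)) {
            duplicates.push(name);
        } else {
            seen[name] = true;
            load.push(name);
        }
    }
    return {load: load, duplicates: duplicates};
}

var resolved = resolveDependencies(kettleConfig, appConfig);
// resolved.load is ["lib/jquery.js", "lib/infusion.js", "app/decapod.js"]
// resolved.duplicates is ["lib/jquery.js"]
```

Kettle's entries come first in the load order here, on the assumption that the application builds on top of Kettle.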
Date Picker
The CollectionSpace work is currently in need of a date picker. Yura
has been looking at the jQuery date picker but unfortunately has hit
some accessibility issues with it. The biggest problem seems to be
keyboard behaviour - it is impossible to determine what you are
changing via the keyboard until the change has been made.
Yura is going to look at other date pickers such as YUI's and
Google's. He's also going to talk to Erin about what the long term
requirements of date selection are. One issue to keep in mind is
support for fuzzy dates such as 'circa 1900' and 'paleolithic era'.
There is also a concern about the actual format the date is stored in.
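One possible way to think about the storage question, purely as a sketch: keep the text exactly as it was entered, together with an inclusive range of years that it could mean. Nothing like this was decided at the meeting. The field names, the `overlapsYears` helper and the ten year margin for 'circa' are all assumptions for illustration.

```javascript
// ASSUMED storage format: the text as entered plus an inclusive year range.
// The ranges below are illustrative guesses, not agreed values.
var circa1900 = {display: "circa 1900", earliestYear: 1890, latestYear: 1910};
var exactYear = {display: "1969", earliestYear: 1969, latestYear: 1969};

// True when the fuzzy date could fall inside the inclusive query range.
function overlapsYears(fuzzyDate, fromYear, toYear) {
    return fuzzyDate.earliestYear <= toYear && fuzzyDate.latestYear >= fromYear;
}

var inFirstHalfOfCentury = overlapsYears(circa1900, 1900, 1950); // true
var exactInRange = overlapsYears(exactYear, 1900, 1950);         // false
```

An exact date is then just a range whose two ends are the same, so searching and sorting could treat both kinds of date alike.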
Databases
We are currently working on two different levels when it comes to data
access. We are planning architecture and brainstorming about how we
handle and organize data in a database and in data feeds and how that
will be internalized in the framework. At the same time we are working
on a practical level - we need to get data out of and into CouchDB.
We are starting to see some short term implementations being built and
we are continuing to think about the long term plan.
In our demo we have data from MMI and McCord and we have CouchDB set up
to have each museum's data in its own database. Sveto followed that
lead and put users in their own database too. This work speaks to the
future ability to federate users across different museum systems.
One thing we need to keep in mind is that there may be groups who want
to collaborate but who don't want their data to be housed on the same
server. We also need to be careful not to think of Couch as just a
database. It's also a set of data feeds. It's at the level of the
feeds and APIs that we need to think about connections between data.
Going forward it seems that we should consider having all the data in
a particular Engage instance live in the same database in CouchDB.
One issue we talked about was how we would distinguish data that came
to us from a museum source from data that we collected.
One of the advantages of a schema-less database is that it enables
bi-directional flow of data. Our data needs to go back upstream.
We will likely have 'shadow documents' in the system. For example an
artifact, like the Spock Decanter, wouldn't have a single document in
the database. There would be at least two - one from the museum and
one containing data generated from Engage. In fact, there will be
museums who give us access to several sources of data so there is a
possibility of a single artifact having several documents whose data
will be merged before being shown to the user. We can use views in
CouchDB to combine the data, or perhaps some other mechanism.
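As a sketch of the view idea, here is the kind of map function that could bring the documents for one artifact together, and also record where each one came from. CouchDB map functions receive a document and call `emit(key, value)`; everything else here is an assumption. In particular the `artifactKey` and `source` fields are hypothetical, since we have not decided how a shadow document points at its museum document, and the `emit` defined below is only a stand-in so the example runs outside CouchDB.

```javascript
// Stand-in for CouchDB's emit(), so the sketch can run outside CouchDB.
var rows = [];
function emit(key, value) {
    rows.push({key: key, value: value});
}

// Map function as it might appear in a design document.
// ASSUMPTION: every artifact-related document carries an "artifactKey"
// shared by all documents about the same artifact, and a "source" field.
function mapBySource(doc) {
    if (doc.artifactKey && doc.source) {
        emit([doc.artifactKey, doc.source], doc);
    }
}

// Made-up sample documents.
var sampleDocs = [
    {_id: "a1", artifactKey: "screwdriver", source: "museum",
     name: "Left handed screwdriver"},
    {_id: "a2", artifactKey: "screwdriver", source: "engage",
     comments: ["Nice!"]},
    {_id: "u1", type: "user"}
];
for (var i = 0; i < sampleDocs.length; i++) {
    mapBySource(sampleDocs[i]);
}
// rows now holds two entries; the user document is skipped.
```

Because CouchDB sorts view rows by key, all rows for one artifact would sit next to each other, and the second part of the key tells museum data apart from data we collected.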
Concretely, thinking about the data that we currently have, a
collection would be a document and an artifact would also be a
document which would contain an array of comments. A user would also
be a separate document. Here is a rough sketch of what a collection
may look like. Note that there are 3 documents represented here whose
data would be used when rendering a collection.
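// A collection document; it refers to its artifacts by id.
var michellesCollection = {
    id: 12345,
    name: "Michelle's collection",
    user: "michelle at dsouza.org",
    comments: ["This collection rules!", "Me too!"],
    artifacts: [6789, 1234]
};

// The artifact document as it comes from the museum.
var artifact = {
    id: 6789,
    name: "Left handed screwdriver"
};

// The shadow document holding the data generated in Engage.
var shadowArtifact = {
    id: 6788,
    inCollections: [12345, 657],
    comments: ["This is not very different from a right handed screwdriver"]
};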
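To show what merging before display could mean for the artifact and its shadow document above, here is a minimal sketch. The merge policy is an assumption, not something we agreed on: arrays are concatenated, other fields are taken from the last document that defines them, and the ids of the source documents are collected in a made-up `sourceIds` field. How the two documents get paired up in the first place is not settled in these notes, so the sketch simply takes them as a list.

```javascript
var artifact = {id: 6789, name: "Left handed screwdriver"};
var shadowArtifact = {
    id: 6788,
    inCollections: [12345, 657],
    comments: ["This is not very different from a right handed screwdriver"]
};

// ASSUMED policy: concatenate arrays, let later documents win for other
// fields, and gather the source document ids in "sourceIds".
function mergeDocuments(docs) {
    var merged = {sourceIds: []};
    for (var i = 0; i < docs.length; i++) {
        var doc = docs[i];
        for (var key in doc) {
            if (!Object.prototype.hasOwnProperty.call(doc, key)) { continue; }
            if (key === "id") {
                merged.sourceIds.push(doc.id);
            } else if (Array.isArray(doc[key])) {
                var existing = Array.isArray(merged[key]) ? merged[key] : [];
                merged[key] = existing.concat(doc[key]);
            } else {
                merged[key] = doc[key];
            }
        }
    }
    return merged;
}

// Museum document first, then the Engage shadow document.
var rendered = mergeDocuments([artifact, shadowArtifact]);
// rendered.name is "Left handed screwdriver"
// rendered.sourceIds is [6789, 6788]
```

The source documents are left untouched, which matters if the Engage data is later to be sent back upstream separately from the museum's own record.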