My Collection ENGAGE-418 patch
Colin Clark
colinbdclark at gmail.com
Mon Feb 22 22:56:16 UTC 2010
Hi Sveto,
Thanks for the fix. It looks like you've rolled several fixes into one patch here, which is awkward. It really makes review and verification much more difficult when there is more than one issue in a single patch, since any particular fix will have side effects.
Looking at your patch, and your description of what you've done, you've actually addressed at least three issues, not just ENGAGE-418:
1. Increased the number of documents returned from Couch lucene for My Collection, fixing a bug that limits a user's My Collection to 25 artifacts
2. Reorders a user's My Collection to return them in collection order, not in whatever default order Lucene returns them in
3. ENGAGE-418, where users see artifacts in different languages due to the fact that we query them via UUID rather than by museum-specific ID.
Did you, by chance file JIRAs for #1 and #2?
Since we're doing targeted bug fixes, it's risky to fix multiple issues without first asking and getting them added to the bug parade. In the future, it would be hugely helpful if you can break multiple issues up into separate tickets and separate patches.
So, what next?
James agreed that switching between languages mid-stream is not a primary use case, so we can afford to delay on fixing this one. I think it's still a bug that we're collecting by UUID instead of museum-specific ID and language, but we can safely fix it after 0.3 beta is released.
Similarly, the collection order issue is indeed a bug and a bit of a drag, but one we can live with for now.
The implicit limit on a user's My Collection size worries me a bit, and your fix for that seems particularly straightforward. I hope in the future we can come up with a different strategy than relying on Lucene for this kind of query, but in the meantime I think we should fix this issue. I've filed a ticket and will fix:
http://issues.fluidproject.org/browse/ENGAGE-437
While we're on the subject, I want to point out one fairly major issue I saw in your ENGAGE-418 patch. Your algorithm for sorting the results from Lucene looks like this:
+ for (var i = 0; i < originalArtifacts.length; i++) {
+ for (var j = 0; j < artifacts.length; j++) {
+ if (artifacts[j].artifactId === originalArtifacts[i].artifactId) {
+ result.push(artifacts[j]);
+ }
+ }
+ }
As I read it, that's a pretty slow algorithm. In terms of O-notation, linearly searching through the entire array of artifacts for each artifact is something like O(n^2). I think using array.sort()--or at very worst, indexing artifacts by ID--would be more efficient than this. Again, let's not worry about this until after 0.3b, but you'll need to revise your algorithm before I can accept this patch.
Hope this helps,
Colin
On 2010-02-22, at 6:34 AM, Svetoslav Nedkov wrote:
> Hi Colin, Justin,
>
> I have reworked My Collection to support language switching. This involved changing the Lucene CouchDB view to support language queries as well as the native Couch view to return the McCord artifact ID.
> Another change is the new parametrized view in Engage config file that limits the returned documents to 256, otherwise CouchDB Lucene returns 25 documents as a maximum.
> When querying for artifacts by McCord ID the documents are returned in the order they were processed, i.e. by their UUID, so I had to reorder the artifacts in My Collection service.
>
> Here is the Lucene view with language index:
>
> "by_id": {
> "defaults": {
> "store": "no"
> },
> "index": "function(doc) {var ret=new Document(); ret.add(doc.artifact.id); ret.add(doc.artifact.lang); return ret;}"
> }
>
>
> The native view that returns the McCord ID:
>
> function (doc) {
> var artifact = doc.artifact;
> emit({
> 'accessNumber': artifact.label.accessnumber,
> 'lang': artifact.lang
> }, {
> 'title': artifact.label.title || artifact.label.object,
> 'artist': artifact.label.artist,
> 'dated': artifact.label.dated,
> 'medium': artifact.label.medium,
> 'dimensions': artifact.label.dimensions,
> 'mention': artifact.label.mention,
> 'accessnumber': artifact.label.accessnumber,
> 'description': artifact.description || "",
> 'mediaCount': artifact.mediafiles ? artifact.mediafiles.mediafile.length.toString() || "0" : "0",
> 'media': artifact.mediafiles ? artifact.mediafiles.mediafile || [] : [],
> 'commentsCount': artifact.comments ? artifact.comments.cnt || "0" : "0",
> 'comments': artifact.comments ? artifact.comments.comment || [] : [],
> 'relatedArtifactsCount': artifact.related_artifacts ? artifact.related_artifacts.cnt || "0" : "0",
> 'relatedArtifacts': artifact.related_artifacts ? artifact.related_artifacts.artifact || [] : [],
> 'image': artifact.images ? artifact.images.image : [],
> 'artifactId': artifact.id
> });
> }
>
>
>
> The patch is attached to:
>
> http://issues.fluidproject.org/browse/ENGAGE-418
>
>
>
> Regards,
>
> Svetoslav
---
Colin Clark
Technical Lead, Fluid Project
http://fluidproject.org
More information about the fluid-work
mailing list