some questions about uploader

Tue May 13 19:07:12 UTC 2008

Two great questions, one interesting answer: back-end implementation  
and choices.

There are comments inline... and, I warn you, quite a bit of  
digression and musing and not a lot of conclusion. Please forgive me.

On May 13, 2008, at 6:46 AM, Anastasia Cheetham wrote:

>
> Justin, these are some excellent questions! Thanks so much for doing  
> such a thorough job of testing.
>
> I have some technical questions for Eli:
>
>>>    b) If the partial download is not deleted, what happens to it  
>>> if the user clicks "Cancel" (or clicks the remove button) after  
>>> clicking pause? Will the file be left in the queue, will it be  
>>> deleted from the system, or will it just sit on the hdd for  
>>> someone else to delete?
>>
>> My recommended behavior would be that when the back-end gets a  
>> "Cancel" that it should discard the entire upload session since  
>> that is consistent with what Cancel usually means.
>
> I assume this would mean actually deleting any already-uploaded  
> files from the server (e.g. if I pause the upload after 2 of 5 files  
> have been uploaded, then cancel, those 2 uploaded files would be  
> removed from the server).
>
> Does the underlying SWFUploader (which carries out the actual  
> communication with the back-end) allow this action? I might be  
> surprised if it does, as this seems like quite a security risk!

You are right, the client code can't really manage this. This is where  
the back-end gets interesting and where a back-end developer has a lot  
of choice in how complex or simple they want to make their  
implementation. Also it indicates to me a couple of extra messages  
that we should probably pass to the back-end.

Most back-ends are going to write the data for each uploaded file to a  
temp space during the upload, perhaps with a temp name.

If the upload doesn't complete then the back-end code must *clean up*  
and remove the temp file.

If the upload succeeds then the back-end can either move the file to  
its permanent spot, or it can move it to another less-temporary,  
temporary spot to wait for the next step. Next steps could be letting  
the user preview the uploaded files and then modify the names, add  
metadata, remove the file, etc.

So there is definitely some clean up that has to be done server-side.

Which means that we should send clear Cancel and Done messages back to  
the server so that it knows when to clean up. Although maybe the  
indicator that we're done is the next request.
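
To make that temp-space lifecycle concrete, here is a rough sketch of what a back-end might do. Everything in it is hypothetical: the UploadSession class, its receive/cancel/done methods, and the directory layout are illustrative assumptions, not part of the Uploader or SWFUpload. It also assumes file names have already been sanitized.

```python
import shutil
import tempfile
from pathlib import Path


class UploadSession:
    """Hypothetical per-session store: files land in a temp dir first."""

    def __init__(self, final_dir):
        self.final_dir = Path(final_dir)
        # Each session gets its own temp space (assumed layout).
        self.temp_dir = Path(tempfile.mkdtemp(prefix="upload-"))

    def receive(self, name, data):
        # Write each uploaded file into the session's temp space.
        # Assumes `name` is already a safe, bare file name.
        (self.temp_dir / name).write_bytes(data)

    def cancel(self):
        # "Cancel" message: discard everything uploaded in this session.
        shutil.rmtree(self.temp_dir, ignore_errors=True)

    def done(self):
        # "Done" message: move files to their permanent spot, then clean up.
        self.final_dir.mkdir(parents=True, exist_ok=True)
        moved = []
        for path in sorted(self.temp_dir.iterdir()):
            shutil.move(str(path), str(self.final_dir / path.name))
            moved.append(path.name)
        shutil.rmtree(self.temp_dir, ignore_errors=True)
        return moved
```

With something like this, pausing after 2 of 5 files and then cancelling removes only the session's temp directory; nothing has reached the permanent location yet.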

>>> 3) Currently the user is able to add multiple instances of the  
>>> same file (with the same name) to the file queue and upload them.  
>>> I am assuming that this should not be allowed. Will there be some  
>>> error message given to the user? Similarly will the user be able  
>>> to upload a file with an identical name and type to one that  
>>> already resides in the upload folder or will they be warned?  
>>> Unfortunately I cannot test this out myself right now, as the  
>>> upload process is only simulated.
>>
>> Here is the scenario that the uploader is supposed to be able to  
>> support: user selects file A from folder Foo, and then selects file  
>> A from folder Bar. Foo/A and Bar/A are different files
>>
>> We could have a quick and dirty rule in the Uploader that says we  
>> do not accept two files with the same names, or (better) two files  
>> with the same names of the same exact size.
>
> What does the underlying SWFUploader object do with files of the  
> same name? Does it just ship them off to the server and rely on the  
> server to let it know if the server can't/doesn't want to handle the  
> situation?

Yes, currently the Uploader just pushes up the files as-is, since it  
knows very little about the file. As I said above, we could make some  
guesses about the nature of the file, but they would be just guesses.

So yes, the server needs to provide feedback that it can't handle a  
certain upload.
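
For what it's worth, the "same name and same size" guess could be sketched roughly like this. The function name and the (name, size) tuple format are assumptions for illustration only, and the result is still just a guess: two different files can share both a name and a size.

```python
def find_probable_duplicates(queue):
    """Return (name, size) pairs that appear more than once in the queue.

    `queue` is assumed to be an iterable of (name, size_in_bytes) tuples.
    """
    seen = set()
    duplicates = []
    for name, size in queue:
        key = (name, size)
        if key in seen:
            # Same name and same size as an earlier entry: probably a repeat.
            duplicates.append(key)
        else:
            seen.add(key)
    return duplicates
```

Here Foo/A and Bar/A would only be flagged if they also happened to be exactly the same size.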

But what the server can or can't handle really depends on how the  
server-side developer decides to handle files, and what technology  
they are using as a content repository.

If the files are stored in a file system, then this puts file system  
constraints on file naming -- no two files with the same name in the  
same directory. But even in this restricted case, the back-end could  
dynamically rename the files (image-a_1, image-a_2, ...) before  
stuffing them into the directory, and then create a pointer to the new  
file that still used the original file name as the *meta-name*. (this  
stuff gets tricky quickly).
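
A minimal sketch of that rename-and-point-back idea, assuming a plain dict stands in for wherever the meta-names would really be kept. The function name and the index structure are hypothetical.

```python
from pathlib import Path


def store_with_unique_name(directory, original_name, data, index):
    """Write `data` under a name that is unique in `directory`.

    `index` maps the stored name back to the original name (the
    "meta-name"); here it is just a dict, which is an assumption.
    """
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    stem = Path(original_name).stem
    suffix = Path(original_name).suffix
    candidate = original_name
    counter = 0
    while True:
        try:
            # Mode "xb" fails if the file already exists, so the check
            # and the create happen in a single step.
            with open(directory / candidate, "xb") as handle:
                handle.write(data)
            break
        except FileExistsError:
            counter += 1
            candidate = f"{stem}_{counter}{suffix}"
    index[candidate] = original_name
    return candidate
```

The first upload keeps its own name; later ones with the same name get _1, _2, and so on, while the index still remembers what the user called them.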

But if the files are being stored in a database, then there are all  
kinds of choices that can be made since the directory/file name  
relationship is no longer the unique identifier. The bigger issue then  
is what is the user model? What is the organizational scheme that  
works best for the user? What are unique identifiers that allow users  
to distinguish between files that they have uploaded?

Ultimately users don't really care where the file is stored. They just  
care that they can find and organize their files in ways that make  
sense to them. The file and folder metaphor works well for most files,  
but we're seeing more and more systems which attempt to organize files  
by various kinds of meta-data and/or the actual contents of the file.  
See smart searches in Mac OS X.

Another interesting case is the CARET sketches for content management  
which use a model where tags become the organizing principle.

Or, in an environment where specific file metadata is a key identifier  
for the domain in which the files are used, that domain-specific  
metadata should form the organizational structure for the files. So art  
files would be organized by artist, period, media, etc.

I'm really digressing. But I guess this digression is to say that the  
Uploader shouldn't make too many assumptions about how the files are  
going to be used or the capabilities of the back-end system.

But we do need to respond intelligently to those capabilities and  
constraints. Another reason why I wanted to get Uploader into the  
hands of real users in the context of real applications.

Let me tease out one set of scenarios. Perhaps just for my own  
amusement.

In a file-system-based content store, I see three scenarios:

- Uploader uploads files as-is, back-end renames files on the fly

- Uploader uploads files as-is, back-end returns an error
	- Uploader then
		a) informs the user
		b) informs the user, and allows the user to suggest a new file name to the server
			- server then renames file
		c) informs the user, and allows the user to tell server to rename the file

- After queuing but before upload, the Uploader validates each file against the server, server returns an error for any duplicate file names:
	- Uploader then
		a) informs the user
		b) informs the user, and allows the user to rename the file (which then gets sent in post-parameters and the server needs to do the rename)
		c) informs the user, and allows the user to tell server to rename the file
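
The server side of that third scenario (validate before upload) might look roughly like the sketch below. The function name, the error text, and the idea that the server gets the queued names as a simple list are all assumptions; no such call exists in the Uploader today.

```python
def validate_queue(queued_names, existing_names):
    """Return {name: error message} for each queued name the server rejects.

    `existing_names` is assumed to be the names already in the target
    directory; `queued_names` is what the Uploader wants to send.
    """
    taken = set(existing_names)
    errors = {}
    for name in queued_names:
        if name in taken:
            errors[name] = "duplicate file name"
        else:
            # Reserve the name so a repeat later in the same queue
            # is reported as well.
            taken.add(name)
    return errors
```

The Uploader could then use the returned errors to drive any of the options a) through c) before a single byte is uploaded.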

No simple answers here. And, of course, what is probably best for the  
user is the thing that adds the most complexity to the component.

- Eli

>
>
> -- 
> Anastasia Cheetham                   a.cheetham at utoronto.ca
> Software Designer, Fluid Project    http://fluidproject.org
> Adaptive Technology Resource Centre / University of Toronto
>

. . . . . . . . . . .  .  .   .    .      .         .              .                     .

Eli Cochran
user interaction developer
ETS, UC Berkeley
