Tuesday, October 31, 2006

Paper: "EnsemBlue: Integrating Distributed Storage and Consumer Electronics"

EnsemBlue: Integrating Distributed Storage and Consumer Electronics
Daniel Peek and Jason Flinn, University of Michigan


EnsemBlue is a distributed file system for personal multimedia that incorporates both general-purpose computers and consumer electronic devices (CEDs). EnsemBlue leverages the capabilities of a few general-purpose computers to make CEDs first class clients of the file system. It supports namespace diversity by translating between its distributed namespace and the local namespaces of CEDs. It supports extensibility through persistent queries, a robust event notification mechanism that leverages the underlying cache consistency protocols of the file system. Finally, it allows mobile clients to self-organize and share data through device ensembles. Our results show that these features impose little overhead, yet they enable the integration of emerging platforms such as digital cameras, MP3 players, and DVRs.


Blogger Anthony Nicholson said...

Official scribe comments:

Daniel Peek described EnsemBlue, a framework for integrating consumer electronic devices (CEDs) into commodity distributed file systems (DFSs). This has been difficult because CEDs are closed: such devices cannot simply run the distributed file system's client software to integrate their storage with the user's other computing devices. Instead, Dan described how EnsemBlue uses the user's general-purpose computers (such as desktops and laptops) as a bridge between the distributed file system and each CED that connects to them (as when an iPod syncs with a user's desktop machine).

A key challenge here is namespace conflicts between the DFS and the proprietary naming structures found on CEDs. EnsemBlue handles this by tracking mappings between the name of an object in the DFS and its name on each given CED.

Since CEDs are closed systems, the general-purpose computers must execute all custom code in the system. These computers therefore need to know when data is updated, so that they can take actions such as updating custom indexes on CEDs. The authors leverage the fact that every DFS already has a distributed notification protocol: the cache consistency mechanism. On top of it they introduce the concept of a "persistent query," which is an object in the DFS that indicates what the query is looking for, such as new mp3 files added to the DFS. The authors presented an example of how a persistent query for all new m4a files could be used to implement a transcoder from m4a to mp3 files.

Finally, the authors described how they handle disconnected devices that cannot reach the general file server. In such situations, several of the user's devices might be able to contact each other but not the remote file server. One of the devices becomes a "pseudo-file server," acting as a file server to the best of its abilities and serving the other devices whatever files it happens to hold at the time.
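As a rough illustration of the persistent-query idea and the m4a-to-mp3 transcoder example from the talk, here is a minimal Python sketch. It is not EnsemBlue's code: ToyFileSystem, PersistentQuery, and run_transcoder are hypothetical names, a single in-memory dictionary stands in for the DFS, and the "transcoding" step is a placeholder. The sketch assumes that the file system appends matching paths to the query object, so that the transcoder only has to read the query to learn about new files.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PersistentQuery:
    """Hypothetical stand-in for a query object stored in the DFS."""
    name: str
    predicate: Callable[[str], bool]                  # what the query is looking for
    matches: List[str] = field(default_factory=list)  # paths recorded by the file system


class ToyFileSystem:
    """In-memory stand-in for the DFS (assumption: one process, no caching)."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}
        self.queries: List[PersistentQuery] = []

    def register(self, query: PersistentQuery) -> None:
        self.queries.append(query)

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data
        # Record the new file in every persistent query whose predicate matches.
        for query in self.queries:
            if query.predicate(path):
                query.matches.append(path)


def run_transcoder(fs: ToyFileSystem, query: PersistentQuery) -> List[str]:
    """Drain the query's recorded matches and write an .mp3 for each .m4a."""
    produced = []
    while query.matches:
        src = query.matches.pop(0)
        dst = src[: -len(".m4a")] + ".mp3"
        fs.write(dst, b"mp3:" + fs.files[src])  # placeholder, not real transcoding
        produced.append(dst)
    return produced
```

A transcoder running on a general-purpose computer would register a query such as PersistentQuery("new-m4a", lambda p: p.endswith(".m4a")) and call run_transcoder whenever it is told the query object has changed. In the real system that notification would come from the cache consistency protocol rather than from a direct function call.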

One questioner asked how this work would fit into the Universal Plug and Play (UPnP) initiative. Dan responded that EnsemBlue currently requires read and write access to the device (through USB, for example), and that future work will allow it to work with arbitrary protocols like UPnP. Jawwad Shamsi from Wayne State University asked whether they require a separate general-purpose device for each mobile device. Dan answered that any number of CEDs can connect to any number of general-purpose computers. Christopher Stewart from the University of Rochester asked if they had encountered any performance tradeoffs in building the protocol that interacts with the dedicated host machine. Dan responded that they hadn't measured the performance of integrating data back into the DFS through the general-purpose computer, but since their system is weakly consistent anyway, such performance would not be that important.

6:42 PM  
