We are creating components of software infrastructure capable of supporting interactions among people, spaces and mobile devices on scales ranging from single rooms to inter-continental multi-person collaborations.
Collaboration support is the main target application area where we apply and evaluate our technology. We recognize that collaboration happens in scheduled formal meetings as well as spontaneously during chance encounters. Finally, some of the work in groups happens off-line when people contribute comments, opinions and information asynchronously.
We are developing and evaluating new interfaces for people to interact with their environments.
e21 is our longest-running facility, a conference room fully instrumented with ubiquitous technologies. Its features include large whiteboard wall surfaces that double as four integrated projection displays, speech input through hand-held close-talk microphones, an LED display for weather and news reports, a five-speaker sound system that can "rotate" the sound focus to any wall, two steerable and two stereo cameras for detecting and tracking people, and a ceiling-mounted microphone array for detecting sound. A wall-mounted LCD touchscreen is used for room configuration and status displays.
e21 is a working conference room in constant use for formal and informal meetings, for exploring how ubiquitous technologies can support meetings, for demonstrations, and for vision-based tracking applications.
To build workspaces that effectively support people in everyday work situations, one needs to focus on both the technology and the physical environment in which the work takes place. In building our intelligent workspace, we are doing just that. Working from scenarios we developed, we are focusing on three areas of research. First, we are exploring the use of mobile furniture to create a dynamically reconfigurable workspace: as the physical arrangement of the space changes, the technology supporting the space should adapt accordingly. Second, we are focusing on tools to enable this sort of dynamic reconfiguration: we are developing novel computer vision and other sensing technologies for inferring activities in a space. With knowledge of the activities in which a person is engaged, an intelligent workspace can offer relevant assistance. Third, we are studying how people use an intelligent workspace such as the one we are building. We plan to conduct workplace studies by having people use the space for extended periods of time, thus allowing us to evaluate our work and to engage users in an iterative design and development cycle.
The Ki/o project transforms informal public gathering spaces such as hallways, break areas, and elevator lobbies into intelligent environments like 835 and e21. This requires embedding computation, communication, and perceptual capabilities into the walls of these spaces, and developing perceptually-enabled applications appropriate for them.
Several Ki/o kiosks have already been installed around the AI Laboratory. These kiosks will soon grant users direct access to useful location and context-based information and services.
Please see the ki/o working group site for more information.
The h21 mobile device is a personal intelligent "space" that a user carries around throughout the day. The base h21 platform consists of Compaq iPaq hardware augmented with extra hardware including a CCD camera, wireless LAN, extra flash memory, and an accelerometer. Tablet PCs are also currently being adapted into mobile AIRE spaces.
As part of Project Oxygen, several faculty offices (nine at last count) have had the Metaglue platform deployed in them. While each office has its own needs, most of them contain the necessary infrastructure for controlling drapes, lights, and projectors through voice commands (either through close-talk wireless microphones or desktop microphone arrays). The occupants of these offices often use them for simple demonstrations of the AIRE technologies, or occasionally for programming their own agents.
Our main agent infrastructure, Metaglue, is the base for the rest of our projects. This multi-agent distributed system provides communication and levels of abstraction for building adaptable systems for intelligent environments. Metaglue allows for the easy replacement or addition of modules such as resource management, security, preferences or modal interfaces. Agents in Metaglue natively have inter-agent communication, persistent storage backed by a chosen database, and configurable agent-specific attributes. Agents can, if necessary, specify the need to work on a specific host or with a particular agent. When an agent is called upon and is not running at that time, Metaglue starts it.
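The start-on-demand behavior can be illustrated with a minimal Python sketch. This is not Metaglue's API; the names `AgentRegistry`, `register`, `rely_on`, and `LightAgent` are hypothetical and exist only to show the idea that a request for an agent starts it if it is not already running.

```python
class AgentRegistry:
    """Toy registry that starts an agent the first time it is requested."""

    def __init__(self):
        self._factories = {}   # agent name -> callable that builds the agent
        self._running = {}     # agent name -> live agent instance

    def register(self, name, factory):
        self._factories[name] = factory

    def rely_on(self, name):
        # Start the agent only if it is not already running.
        if name not in self._running:
            self._running[name] = self._factories[name]()
        return self._running[name]


class LightAgent:
    """Hypothetical agent controlling the lights in a room."""

    def __init__(self):
        self.on = False

    def turn_on(self):
        self.on = True
        return self.on


registry = AgentRegistry()
registry.register("lights", LightAgent)
first = registry.rely_on("lights")    # the agent is started here
second = registry.rely_on("lights")   # the running agent is reused
```

A caller never needs to know whether the agent was already running, which is the property the paragraph above describes.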
As our technology expands to cover more places and people, it becomes increasingly difficult to write applications that effectively use large numbers of diverse resources. For our purposes, resource management involves both selecting appropriate resources for a given application and allocating these scarce resources among applications. Existing resource managers fall short of fulfilling these needs on a large scale. We are therefore working on a new system that operates more effectively, especially with multiple users and locations. This system takes a more decision-theoretic approach to arbitrating resource requests.
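As a rough illustration of what decision-theoretic arbitration can mean, the sketch below assigns scarce resources to competing requests so that total utility is maximized. It is not the group's resource manager: the function `arbitrate`, the request and resource names, and the utility values are all assumptions made for the example, and the brute-force search is practical only for a handful of requests.

```python
from itertools import permutations


def arbitrate(requests, resources, utility):
    """Return the assignment of distinct resources to requests with the highest total utility.

    Assumes there are at least as many resources as requests.
    """
    best_total, best_assignment = float("-inf"), {}
    for chosen in permutations(resources, len(requests)):
        total = sum(utility[(req, res)] for req, res in zip(requests, chosen))
        if total > best_total:
            best_total = total
            best_assignment = dict(zip(requests, chosen))
    return best_assignment, best_total


# Hypothetical utilities: how well each resource serves each request.
utility = {
    ("slides", "wall_projector"): 0.9,
    ("slides", "laptop_screen"): 0.4,
    ("video_call", "wall_projector"): 0.8,
    ("video_call", "laptop_screen"): 0.7,
}
assignment, total = arbitrate(
    ["slides", "video_call"], ["wall_projector", "laptop_screen"], utility
)
```

Here both requests prefer the wall projector, but giving it to the slides costs the video call little, so that assignment wins overall.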
We are building a new level of discovery and communication services that will allow Metaglue-enabled spaces to interact with one another. This new infrastructure layer, called Hyperglue, will also allow software running on behalf of people, groups of people, institutions and information sources to become part of the global communication network.
The design of Hyperglue is based on two main concepts: delegation and high-level service discovery. Imagine a scenario in which Bob wishes to communicate with Anne without knowing her location. Bob's personal communication agent first contacts Anne's communication agent through Hyperglue. The two agents negotiate and decide whether the communication request should be granted and which modes of communication are permitted. Then both Bob's and Anne's personal agents negotiate with their surrounding environments for the use of the necessary resources.
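The two negotiation stages in the Bob and Anne scenario can be sketched as follows. This is an illustrative toy, not Hyperglue's protocol: the functions `negotiate` and `pick_resources`, the policy format, and the device names are all hypothetical.

```python
def negotiate(requested_modes, callee_policy, caller):
    """Return the communication modes the callee's agent grants to the caller."""
    # Fall back to the default ("*") policy for callers without their own entry.
    allowed = callee_policy.get(caller, callee_policy.get("*", set()))
    return [mode for mode in requested_modes if mode in allowed]


def pick_resources(granted_modes, environment):
    """Map each granted mode to a device available in the person's current space."""
    return {mode: environment[mode] for mode in granted_modes if mode in environment}


# Stage 1: Bob's agent asks Anne's agent which modes are permitted.
anne_policy = {"bob": {"audio", "text"}, "*": {"text"}}
granted = negotiate(["video", "audio", "text"], anne_policy, "bob")

# Stage 2: Bob's agent asks his surrounding environment for matching devices.
bob_devices = pick_resources(
    granted, {"audio": "desk_mic_array", "video": "wall_camera"}
)
```

Neither party needs to know the other's location: each personal agent deals only with the other agent and with its own local environment.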
The devices that occupy the Intelligent Room define the boundaries of its functionality. While computers obviously play a major role, less complex devices extend user interaction beyond the reach of traditional computing environments. The Intelligent Room incorporates many computer-controlled devices including lights, audio equipment, projectors, sensors, and servos. My work involves standardizing the way the state information of these devices is handled. This convention will allow developers to easily add support for novel devices.
Contact Tyler Horton with questions about this project.
Today's cell phones are fast becoming computing-rich devices. Bluetooth-enabled camera phones can serve as a platform for a number of novel applications. This line of research seeks to develop these applications, with a particular emphasis on using a phone (1) as a device for interacting with an intelligent space and (2) as a personal token that people can use to identify themselves to that space.
We are currently working on deploying simple sensors throughout our instrumented spaces. In addition to speech, vision and graphical interfaces, these sensors would provide us with simple yet robust information about motion in a space, noise and light levels, and pressure on surfaces such as the floor and furniture. These sensors will help us better judge the state of the interactions in a space as well as provide feedback about the state of some of the space's physical resources such as lights, speakers, drape controllers, etc.
There are two components to this project: firstly, we need to deploy the sensors in a way that makes them accessible from any of the computers controlling the space. Secondly, information from numerous sensors needs to be combined to provide a high-level picture of the state of the interactions in a space.
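The second component, combining readings from several sensors into a high-level picture, might look like the rule-based sketch below. The sensor names, the normalization of readings to the range 0 to 1, the thresholds, and the state labels are all assumptions chosen for illustration.

```python
def room_state(readings):
    """Combine simple sensor readings into a coarse activity label.

    `readings` maps a hypothetical sensor kind to a value in [0, 1].
    Missing sensors are treated as reporting 0. The thresholds are
    illustrative assumptions, not measured values.
    """
    motion = readings.get("motion", 0.0)
    noise = readings.get("noise", 0.0)
    seats = readings.get("seat_pressure", 0.0)
    if motion < 0.1 and noise < 0.1 and seats < 0.1:
        return "empty"
    if seats >= 0.5 and noise >= 0.3:
        return "meeting"
    return "occupied"
```

For example, `room_state({"seat_pressure": 0.8, "noise": 0.5})` yields `"meeting"`, while motion alone yields `"occupied"`.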
A Pervasive Computing Environment (PCE) is equipped with many devices, sensors, and actuators that allow it to respond to the needs of its occupants. As the number and complexity of these PCE-embedded components grow, so too does the potential for unexpected behavior. PCEs should therefore include a comprehensive troubleshooter that allows typical users to overcome these inevitable glitches.
Research for Tyler Horton's Master's thesis focuses on creating SPACED, an automatic debugging system for the e21 conference room. SPACED uses knowledge about the functionality and communication model of devices in e21 to recommend corrective actions to users when a problem is detected. Once deployed in e21, the SPACED knowledge base can be updated to reflect future hardware changes. Additionally, the knowledge base can be completely replaced, allowing SPACED to work in any PCE.
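The idea of a replaceable knowledge base driving the recommendations can be sketched as below. This is not SPACED itself: the knowledge-base format, the device names, the fixes, and the `recommend` function are hypothetical, and the sketch simply walks a dependency graph to find the deepest failed component behind a reported symptom.

```python
# Hypothetical knowledge base: each device lists what it depends on and a suggested fix.
KNOWLEDGE_BASE = {
    "projector_display": {
        "depends_on": ["projector", "video_switch"],
        "fix": "Restart the display agent.",
    },
    "projector": {"depends_on": ["power"], "fix": "Turn the projector on."},
    "video_switch": {
        "depends_on": ["power"],
        "fix": "Select the correct input on the video switch.",
    },
    "power": {"depends_on": [], "fix": "Check the power strip."},
}


def recommend(symptom, status, kb=KNOWLEDGE_BASE):
    """Return fixes for the deepest failed components that the symptom depends on.

    `status` maps device names to True (working) or False (failed);
    devices missing from `status` are assumed to be working.
    """
    fixes, seen = [], set()

    def visit(device):
        if device in seen:
            return
        seen.add(device)
        failed_deps = [d for d in kb[device]["depends_on"] if not status.get(d, True)]
        for dep in failed_deps:
            visit(dep)
        # Blame this device only if it is failing and nothing beneath it has failed.
        if not failed_deps and not status.get(device, True):
            fixes.append(kb[device]["fix"])

    visit(symptom)
    return fixes


status = {"projector_display": False, "projector": False, "video_switch": True, "power": True}
advice = recommend("projector_display", status)
```

Passing a different dictionary as `kb` changes the devices and advice without touching the reasoning code, which mirrors the claim that swapping the knowledge base lets the system work in another environment.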
k:info is a knowledge-based information display engine which retrieves and selects news items to bring to the attention of the user based upon contextual clues about the user and the environment.
k:info is the first application designed specifically for the Ki/o kiosk platform.
Please see the project web site for more information.
Contact electronic Max with questions about this project.
Metachat is an experiment in computer-mediated communication to see how interactions recently popularized by desktop instant messaging systems can be extended to include the physical environment around us.
SAM is an HCI micro-project aimed at developing an expressive and responsive user interface agent for e21. Currently, SAM consists of a minimal representation of an animated face, which conveys emotional states such as confusion, surprise, worry, and anger.
SAM is one of the output modalities of a larger project, which is to make e21 affective and emotion-enabled. Inspired by the humanness of HAL in 2001: A Space Odyssey, this project aims to make the Intelligent Room both responsive to user affect (curiosity, frustration, etc.) and expressive of its own state in emotional terms. These emotions would be expressed by SAM, as well as through spoken inflection in TTS output.
In collaboration with the Vision Interface Project (VIP), we have developed a new interface for engaging the speech recognition capabilities of the environment in our activities, such as meetings. The Look-To-Talk (LTT) interface (see paper) involves an animated avatar representing the room, and head-pose tracking. The environment gets ready to engage in a spoken interaction with the user if the user faces the avatar. If, instead, the user faces other people in the space, the speech recognition system of the environment remains inactive, thus preventing accidental responses to utterances that are really directed at other humans and not at the environment. Further work is planned on rigorous evaluation of LTT and on improvements to the system.
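The gating decision at the heart of this interaction can be sketched in a few lines. This is only an illustration, not the LTT implementation: the function name `listening`, the use of a single yaw angle for head pose, and the 15-degree tolerance are assumptions.

```python
def listening(head_yaw_deg, avatar_yaw_deg, tolerance_deg=15.0):
    """Return True when the user's head is turned toward the avatar.

    Angles are in degrees in a shared room coordinate frame (an assumption).
    """
    # Wrap the angular difference into [-180, 180) before comparing.
    diff = (head_yaw_deg - avatar_yaw_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= tolerance_deg
```

Speech recognition would be enabled only while `listening(...)` is true, so a user facing a colleague 90 degrees away from the avatar would not trigger the room.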
DIVA is a multi-modal output system that delivers information to users in an intelligent fashion. When users are in a meeting room listening to a presentation, most information will be delivered via visualization on a screen. Extra information can be presented on smaller screens, and emergency information can be broadcast via audio. When a single user is browsing the same presentation on a mobile phone, DIVA will compactly present visual information on the screen with more emphasis on the use of audio. When this user moves into an office with a desktop computer and an officemate, the way the information is presented must change quickly. In this project we deal with three separate but interlinked problems. First, the system must make efficient use of the cognitive capacities of the users. Second, the system must decide the best output channels for delivering different types of information. Third, the system must take advantage of the available output devices. Finally, all of this must be done dynamically, as the environment and the activity that the user is engaged in may change at any given moment.
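The second and third problems, choosing an output channel from whatever devices are currently available, can be sketched as a preference lookup. This is not DIVA's algorithm: the channel names, the preference orders, and the `private` flag (a guess at how an officemate's presence might matter) are assumptions for illustration.

```python
# Hypothetical preference order of output channels for each kind of information.
PREFERENCES = {
    "main": ["wall_display", "desktop_screen", "phone_screen", "audio"],
    "extra": ["small_screen", "desktop_screen", "phone_screen"],
    "emergency": ["audio", "wall_display", "phone_screen"],
}

# Channels that everyone in the space can see or hear (an assumption).
SHARED_CHANNELS = {"wall_display", "audio"}


def choose_channel(kind, available, private=False):
    """Pick the most preferred available channel, avoiding shared channels for private content."""
    for channel in PREFERENCES[kind]:
        if channel in available and not (private and channel in SHARED_CHANNELS):
            return channel
    return None
```

Calling the function again whenever the set of available devices changes gives a simple form of the dynamic behavior described above.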
O2flow is a Metaglue package for capturing, multicasting, and rendering real-time audio and video media streams. O2flow maintains its own MBONE-compatible multicast sessions and offers multicast directory lookup services to all Metaglue agents. O2flow relies on Sun's Java Media Framework, and requires full JMF-compatibility (currently available on Win32 or Linux).
Next steps include creating distributed media storage repositories for media archiving, indexing, and retrieval, and integrating O2flow with the eFacilitator for meeting capture.
The goal of this project is to integrate speech into the existing sketching system, ASSIST. ASSIST is a project of the Design Rationale Group. This work is intended to create a multi-modal environment in which mutual disambiguation of the input modes will help identify the user's intentions.
In the future, the infrastructure in the Intelligent Room could add other modalities that would assist in the recognition of the user's intentions. While sketching and talking, users also gesture at parts of the drawing to identify certain parts of the sketch or to indicate actions. The Intelligent Room could provide this data, which would aid in the identification of the user's intentions and in disambiguation.
The goal of this work, done in collaboration with the EWall Project, is to provide algorithms for arranging existing knowledge that inspire alternative ways of understanding by drawing visual and conceptual connections between separate Information Objects, based on observing how users select and organize presented information, make further related inquiries, and develop their ideas. Existing information arrangement algorithms and visualization tools have been surveyed and evaluated, and additions and modifications have been suggested. A design and a prototype consisting of a few primitive arrangements have been built for testing and comparison. Future work will focus on developing and testing more complex algorithms. The algorithms will be used to present a database as well as to visualize and analyze the results of user tests from other projects.
Planlet is a generic software layer for representing user tasks as plans. Planlet makes it easier to build proactive ubiquitous computing (ubicomp) applications by providing generic services to hold and manipulate knowledge about user plans, habits and needs. The knowledge embedded in Planlet can be used by ubicomp applications to reduce task overhead and user distractions. For instance, applications can proactively remind the user to perform planned task steps; they can automatically configure the user's working environment; and they can guide the user in following best-known practices. Planlet is currently being used to build an application that will help AIRE group members give tours of our different AIRE-spaces, for example by automatically setting up demos at the next AIRE-space in the tour and by warning the tour guide when time may run out before the tour is finished.
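To make the idea of representing a task as a plan concrete, here is a minimal sketch of the tour example. It does not reflect Planlet's actual data model: the `Step` and `Plan` classes, their fields, and the durations are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    minutes: int          # estimated duration (illustrative values)
    done: bool = False


@dataclass
class Plan:
    goal: str
    steps: list = field(default_factory=list)

    def next_step(self):
        """Return the first unfinished step, or None when the plan is complete."""
        return next((s for s in self.steps if not s.done), None)

    def minutes_remaining(self):
        return sum(s.minutes for s in self.steps if not s.done)

    def will_overrun(self, minutes_available):
        """True if the unfinished steps need more time than is available."""
        return self.minutes_remaining() > minutes_available


tour = Plan(
    "AIRE tour",
    [Step("e21 demo", 15, done=True), Step("835 demo", 20), Step("Ki/o kiosk demo", 10)],
)
```

An application could use `next_step()` to prepare the next demo in advance and `will_overrun()` to decide when to warn the tour guide.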
Fleets of UAVs can carry out missions in a number of fields such as space exploration and search-and-rescue. However, human supervision is still required to monitor the status of the fleet and ensure that the mission is being carried out as planned. To make it easier for human supervisors to monitor the status of these missions, we used Planlet to create a Command Post application that presents an aggregated and abstracted view of the status information reported by the UAVs. The UI presented by the Command Post is based upon the idea that the mission the UAVs carry out can be thought of as a plan. Using this model, the Command Post presents mission status at the level of goals and plans, and it directs the operator's attention to the UAVs that require it.
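A goal-level roll-up of per-UAV reports might look like the following sketch. The report format, the status values and their ordering, and the function names are assumptions, not the Command Post's design.

```python
# Hypothetical status values, ordered from least to most severe.
SEVERITY = {"ok": 0, "delayed": 1, "failed": 2}


def goal_status(reports):
    """Roll up per-UAV reports into one status per goal, keeping the worst one."""
    summary = {}
    for report in reports:
        current = summary.get(report["goal"], "ok")
        if SEVERITY[report["status"]] > SEVERITY[current]:
            current = report["status"]
        summary[report["goal"]] = current
    return summary


def needs_attention(reports):
    """Return the sorted names of UAVs whose status is anything other than ok."""
    return sorted({r["uav"] for r in reports if r["status"] != "ok"})


reports = [
    {"uav": "uav1", "goal": "survey_north", "status": "ok"},
    {"uav": "uav2", "goal": "survey_north", "status": "delayed"},
    {"uav": "uav3", "goal": "relay_comms", "status": "ok"},
]
```

The supervisor sees two goal-level entries instead of three raw reports, plus a short list of UAVs to look at.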
The plan-based diagnosis and recovery system works with both the new resource manager and Planlet to make the room more robust. As our spaces become more complex, it becomes more and more difficult for users to understand what is happening in the room when a specific task is being carried out. As a result, when the room experiences a failure it is difficult for a user to manually isolate and recover from the problem. This system is designed to automatically deal with various types of failures in a way that is transparent to the user.
All tasks in the room can be broken down into steps, which can be strung together to create a plan. When a step in a plan fails, this system is designed to isolate the problem and generate a solution. The system uses Bayesian nets to determine the exact failure point. The recovery component and the resource manager then use this information to devise a solution that allows the plan to continue unimpeded. This process is automatically triggered by the failure and requires no action on the part of the user.
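As a small illustration of Bayesian diagnosis, the sketch below uses a naive Bayes model, the simplest kind of Bayesian network, in which one hidden fault variable explains several independent observations. The component names, priors, and likelihoods are invented for the example and do not come from the actual system, which may use a richer network.

```python
def posterior(priors, likelihoods, observations):
    """Posterior over single-fault hypotheses given independent observations (Bayes' rule).

    likelihoods[fault][obs] is P(obs is seen | fault); observations maps
    each observation name to True (seen) or False (not seen).
    """
    scores = {}
    for fault, prior in priors.items():
        p = prior
        for obs, seen in observations.items():
            p_obs = likelihoods[fault][obs]
            p *= p_obs if seen else (1.0 - p_obs)
        scores[fault] = p
    total = sum(scores.values())
    return {fault: score / total for fault, score in scores.items()}


# Invented numbers for illustration only.
priors = {"projector_bulb": 0.2, "video_cable": 0.3, "display_agent": 0.5}
likelihoods = {
    "projector_bulb": {"screen_dark": 0.95, "agent_responds": 0.9},
    "video_cable": {"screen_dark": 0.90, "agent_responds": 0.9},
    "display_agent": {"screen_dark": 0.80, "agent_responds": 0.1},
}
beliefs = posterior(priors, likelihoods, {"screen_dark": True, "agent_responds": True})
most_likely = max(beliefs, key=beliefs.get)
```

Although the display agent has the highest prior, the observation that it still responds shifts the belief toward the cable, which is the kind of inference that lets recovery target the right component.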
The eFacilitator is a software system to support and capture meetings. It contains the eNotePad, an electronic note-taking application that lets meeting participants organize, annotate, and share notes and documents on a Tablet PC or electronic whiteboard. We use distributed agent technology to automate the collaboration, sending out notifications of upcoming meetings and setting up conference rooms with presentation materials. The system also automatically captures all meeting audio, note-taking activity, and slide or web page transitions. A playback tool called the NotePlayer helps a user retrieve a particular portion of a meeting. The user finds a note or slide of interest and plays back the meeting from the moment that item was created, hearing the captured audio while watching the note-taking replay in real time. The eFacilitator can be downloaded here.
In collaboration with the Design Rationale Group (DRG) we are working on a system for capturing and indexing software design meetings. The system combines our work on meeting facilitation and capture with DRG's work on Tahuti, a system for understanding hand-drawn UML sketches.
During software design meetings, designers work out object-oriented software tools, including new agent-based technologies for the Intelligent Room, by sketching UML-type designs on a whiteboard. To capture the design meeting history, our meeting capture system uses available audio, video, and screen capture services to record the entire design meeting. However, finding a particular moment in the video and audio records of the design history can be cumbersome without a proper indexing scheme. To detect, index, and timestamp significant events in the design process, Tahuti records, recognizes, and understands the UML-type sketches drawn during the meeting. These timestamps can be mapped to particular moments in the captured video and audio, aiding in the retrieval of the captured information.
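Mapping sketch-event timestamps onto positions in a recording amounts to a subtraction once both share a clock. The sketch below assumes events are stored as (time in seconds, label) pairs and that the recording's start time is known; the function name `find_offsets` and the sample events are hypothetical.

```python
def find_offsets(events, recording_start, keyword):
    """Return (label, seconds into the recording) for each event whose label contains keyword."""
    return [
        (label, timestamp - recording_start)
        for timestamp, label in events
        if keyword in label
    ]


# Hypothetical sketch events, timestamped on the same clock as the recording.
events = [
    (1000.0, "class Agent drawn"),
    (1042.5, "association added"),
    (1100.0, "class Agent renamed"),
]
agent_moments = find_offsets(events, recording_start=990.0, keyword="Agent")
```

A playback tool could then seek directly to 10 seconds or 110 seconds into the video to review the moments when the `Agent` class was drawn and renamed.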
A tool capable of quickly and effectively visualizing the content of a meeting has long been sought and widely researched. Such a tool should unobtrusively record the progress of a meeting and then show its content in a manner that preserves the format of the meeting while providing tools that facilitate analysis.
MeetingView seeks to meet these goals by recording meetings or conversations in an intelligent space and presenting them in a way that emphasizes the time- and source-sensitive aspects of the information being conveyed. Bridging the EWall and Intelligent Room groups, MeetingView aims to use the rich recording resources that intelligent spaces provide to unobtrusively gather information about a discussion. The future goal is for the room to be able to recognize the speaker and transcribe what he or she is saying. This information is then fed into an EWall View interface (similar to the NewsView tool) for display. Part of developing MeetingView will focus on developing a more robust EWall View tool (agent) that can be used for a larger range of applications.