Merged:
To do:
Integrate GL controls for main window.
Fix bug where interval and scrubber controls in the main window change the frame on the subwindow.
Whiteboard Wizardry:
Interval and scrub sliders no longer send requests for frame numbers beyond the scope of the particular movie that is loaded (no negative frames and no frames beyond the final frame):
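A minimal sketch of that kind of bounds check, written in Python purely for illustration. The function names and the idea of a normalized 0.0 to 1.0 scrubber position are assumptions, not the project's actual code.

```python
def clamp_frame(requested: int, frame_count: int) -> int:
    """Clamp a requested frame index to the valid range [0, frame_count - 1]."""
    if frame_count <= 0:
        raise ValueError("movie has no frames")
    return max(0, min(requested, frame_count - 1))


def scrub_to_frame(position: float, frame_count: int) -> int:
    """Map a scrubber position (assumed to be 0.0 .. 1.0) to a valid frame index."""
    # Positions outside 0.0 .. 1.0 are tolerated and clamped afterwards.
    return clamp_frame(round(position * (frame_count - 1)), frame_count)
```

With a 100-frame movie, a request for frame -5 resolves to frame 0 and a request for frame 250 resolves to frame 99, so neither slider can ask for a frame the movie does not have.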
Also look at Tripp's post/patch for today:
Scrubber now controls middle window... but the rendered frame numbers lag behind by one when changing the interval...
Version control:
Tripp and Kurt's work merged. No current frame number implementation yet.
Maybe this could be a source of inspiration for the Microsoft symposium.
Even though I believe that our species tends to overestimate its effect and importance in the universe, when I think about the space we inhabit, it occurs to me that we are the most powerful force for radical change in our environment. The point that I would like to make is not one of political ideology, although I am concerned with the traditional issues that "environmentalism" engages, but rather one of landscaping. When you look around, think about the world as you perceive it and ask yourself what percentage (quantitative and qualitative) of the sensations that you experience are caused by the actions of humans. On reflection, it occurs to me that people are the environment rather than the earth or the artifacts they leave behind. Social solipsism?
Social networking and community building possibilities for Patholog:
1. Familiar Strangers
Something that Naimark was talking about yesterday... provocative and controversial yet very interesting. An interface for visualizing repeated contact at a certain proximity with strangers, those who are not members of your group, community or "buddy list". This feature would not be available in real time, to preserve the anonymity/privacy of strangers. It seems like it would be valuable to require an "opt in" method for this functionality.
2. Social State and Awareness
The model of instant messaging chat clients should be explored for community relations. Beyond an on/off function, the ability to set states that reflect the type of social interactions that the user wants to participate in is extremely important. A type of explicit receptivity could be set to something as detailed as "having lunch with friends, come join me if you're in the area" which would open your visibility within the system and alert your friends who have the same (or similar) setting.
3. Surveillance Mode (or Parent Mode or Stalker Mode)
Real time data. Boundaries that set off an alarm. A listing of nearby friends.
4. Send me an angel
In a troubling situation, something less than a 911 emergency, an active beacon that alerts your friends that you need some help. It allows you to transmit your present location in real time to your friends' devices so that they will have compass heading information to find you.
5. Voyeur/Exhibitionist
This seems natural to me as both a privacy concern and an expressive capability... don't quite know what this will look like yet.
6. Traffic
Just a mark up of all the data from the servers. Find interesting ways to observe foot traffic in crowded metropolitan areas. A possible source of funding would be to package this data and sell it to marketers... I shudder at the thought but it is a reality.
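For the "Send me an angel" idea above, the compass heading from a friend's position to the beacon can be derived from two GPS fixes. The following is a rough sketch in Python using the standard initial great-circle bearing formula. The function name and the use of decimal-degree coordinates are assumptions for illustration; nothing here reflects an existing Patholog implementation.

```python
import math


def initial_bearing(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Initial great-circle bearing from point 1 to point 2.

    Inputs are decimal degrees. The result is in degrees clockwise
    from true north, in the range [0, 360).
    """
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
```

A friend's device would call this with its own fix as point 1 and the beacon's fix as point 2, then recompute whenever either position updates. Over campus-scale distances the initial bearing is effectively the heading to walk.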
In our meeting yesterday, Tripp and I discussed techniques for visualizing clusters of video in relational and geographic space. This issue overwhelmed our other significant challenge: selecting a meaningful analysis of the video. I am convinced that there are two major axes that must anchor our project: the temporal axis and the spatial axis. Without going into a long diatribe, my argument is that after the senses, temporal and spatial perception are the framework of our perspective on our environment. We like to see spatial and temporal representations of information. There is a deep satisfaction and a great deal of insight to be gained through good visualization of data.
We do not (emphasis) analyze our visual experience in terms of brightness and contrast when we are constructing a mental mapping of our reality. Nevertheless, the capability of machines to provide us with data and metadata, extrasensory for humans, can be leveraged for projects where the additional information enriches our view of the environment.
My feeling is that the visualization of the video takes precedence and that the more mechanistic and abstract relationships between sequences will be the most meaningful when they are viewed in the context of space and time. So our napkin-based brainstorm produced a model of geographic space along the x,y that leaves a 3D map as a trail as it moves along the z axis. This is a complicated and dense visualization that needs to be seen and manipulated before it can be evaluated. The key would be in the creation of a smooth transition between this higher view and then subsequent closer views that drill down to the full-screen video itself.
Wish list.
People (Thesis Committee)
Scott Fisher
Peggy Weil
Joi Ito
People (Production Team)
Want: Will Carter, Todd Furmanski and Tripp Millican
Need: a C/C++ or Java programmer with 2D graphics experience
Tools
Each user in field will require:
1 GPS device
1 tablet/PDA computer
1 digital camera
1 mobile phone (data).
My guess is that 5 users in the field in an area the size of USC would generate the density required for really interesting interactions to start occurring. Perhaps 3 or 4 users could work but I have my doubts...
The development machines should be:
1 desktop for programming/development with serial and wireless connectivity
1 notebook for tweaking in the field with serial and wireless
1 external hard drive for backup and synchronization
Although it would be convenient for me to work exclusively in GUI authoring tools, it seems very unrealistic. So I expect that a C or Java programming environment will be necessary for both development machines. And of course I will use Adobe Photoshop for visualizations and mock-ups.
Money
GPS device - $200
tablet/PDA - $600 to $1000
camera phone - $500
1 year of mobile service - $600
desktop dev box - $1500
notebook dev box - $1500
external hard drive - $400
software expenses - $2000
finishing your MFA thesis on time, graduating and getting a real job - priceless
ugh... I cringed as I wrote that. Sorry to be so tacky.
So that is about $2000 per user in the field. At 5 users that is $10,000, and adding $5400 for the development machines, backup drive and software brings it to $15,400. Let's just round it up to $20,000 (for coffee, alcohol and parking tickets) because it reminds me of a book I read when I was a kid.
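As a quick check on that arithmetic, here is a small Python sketch that totals the line items from the Money list. The variable and function names are made up for the example, and the tablet/PDA entry is carried as a low/high range since the list gives one.

```python
# Per-user field kit, as (low, high) dollar estimates from the Money list.
PER_USER = {
    "GPS device": (200, 200),
    "tablet/PDA": (600, 1000),
    "camera phone": (500, 500),
    "1 year of mobile service": (600, 600),
}

# One-time development costs from the same list.
DEVELOPMENT = {
    "desktop dev box": 1500,
    "notebook dev box": 1500,
    "external hard drive": 400,
    "software expenses": 2000,
}


def per_user_range():
    """Return the (low, high) cost of equipping one user in the field."""
    low = sum(lo for lo, _ in PER_USER.values())
    high = sum(hi for _, hi in PER_USER.values())
    return low, high


def project_total(users: int, per_user_cost: int) -> int:
    """Total budget for a given number of field users at a flat per-user cost."""
    return users * per_user_cost + sum(DEVELOPMENT.values())
```

The per-user kit comes out between $1900 and $2300, which is where the "about $2000" figure comes from, and `project_total(5, 2000)` gives the $15,400 quoted above.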
Possible sources of funding:
Research branches of Microsoft, Intel, Sony, AT&T, Docomo
While discussing the midterm project that Tripp and I are collaborating on, my concerns centered around the capability of current technology to analyze video in a way that is interesting to a user who hopes to discover something about a "character" by viewing a continuous POV shot. The hardware/software that we will use will be off the shelf, meaning essentially a webcam taped to a head and Max/MSP Jitter or SoftVNS to evaluate/manipulate the video. Once a user hands the footage over to the system, there are severe limits on the kind and quality of metadata that can be harvested. I don't mean to underestimate the value of a digital sort on video content, but I am most worried that such a mechanized look at human perspective will tell us more about the system's capabilities than about the internal mental state of the user.
So I propose that we simultaneously pursue a minimally intrusive "live" user annotation system that would allow points of interest to be marked while in the field. By merely adding a flag at specific points in the timecode, we can learn about the moments that are meaningful to the user. I know that Tripp is primarily interested in what a data management system can accomplish without any hints from the user, but by allowing ourselves the insight into the sequences that the user cares about, we can then isolate those shots and try to figure out what they have in common.
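To make the flagging idea concrete, here is a rough Python sketch of what the annotation data might look like and how flagged moments could be turned into shots for comparison. Everything in it is hypothetical: the `Flag` record, the `flagged_windows` helper, and the default of five seconds on either side of a flag are placeholders, not a design we have settled on.

```python
from dataclasses import dataclass


@dataclass
class Flag:
    """A single user mark: an offset into the footage, in seconds."""
    seconds: float
    label: str = ""  # optional note; a bare button press leaves this empty


def flagged_windows(flags, duration, before=5.0, after=5.0):
    """Return (start, end) windows around each flag, in seconds.

    Windows are clipped to the footage and overlapping windows are
    merged, so each returned span is one candidate shot to isolate.
    """
    windows = []
    for flag in sorted(flags, key=lambda f: f.seconds):
        start = max(0.0, flag.seconds - before)
        end = min(duration, flag.seconds + after)
        if windows and start <= windows[-1][1]:
            # Overlaps the previous window: extend it instead of adding a new one.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows
```

The isolated windows could then be handed to whatever automated analysis Tripp's side produces, so the machine-derived metadata is compared only across moments the user actually cared about.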
Ultimately I am interested in a model of active capture vs. passive capture for expression, so the idea of recording and storing everything that I see is initially content-less to me. There is immense value, however, in the capability to record and store everything, a capability that we are inevitably approaching, especially when the user can match his/her thoughts to an image or a sound or any other sensed media.