Imagine if you could settle/rekindle domestic arguments by asking your smart speaker when the room last got cleaned or whether the bins have already been taken out. Or, for a healthier use case, what if you could ask your speaker to keep count of reps as you do squats and bench presses? Or switch into full-on ‘personal trainer’ mode, barking orders to pedal faster as you spin cycles on a dusty old exercise bike (who needs a Peloton!). And what if the speaker was smart enough to know you’re eating dinner and took care of slipping on some mood music?
Imagine if all those activity-tracking smarts were on tap without any connected cameras in your home. Another bit of fascinating research from Carnegie Mellon University’s Future Interfaces Group opens up these possibilities, demonstrating a novel approach to activity tracking that does not rely on cameras as the sensing tool. Installing connected cameras inside your home is a horrible privacy risk, which is why the CMU researchers set about investigating the potential of using millimeter wave (mmWave) Doppler radar to detect different types of human activity.
The challenge they needed to overcome is that while mmWave offers a “signal richness approaching that of microphones and cameras”, as they put it, data sets to train AI models to recognize different human activities as RF noise are not readily available (as visual data for training other types of AI models is). Not to be deterred, they set about synthesizing Doppler data to feed a human activity-tracking model, devising a software pipeline for training privacy-preserving AI models. The results can be seen in this video, where the model correctly identifies several activities, including cycling, clapping, waving, and squats, purely from its ability to interpret the mmWave signal the movements generate and purely having been trained on public video data.
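The article does not describe the researchers’ pipeline in detail, so the following is only a rough sketch of the general idea of turning motion into a Doppler-style signal, not their method. It assumes body points have already been tracked in 3D from video (the tracking step is not shown), uses a hypothetical 60 GHz carrier, and relies on the standard two-way Doppler relation f_d = 2·v_r·f_c / c. All function names here are illustrative.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def radial_velocity(p0, p1, sensor, dt):
    """Velocity of a point along the line of sight to the sensor (m/s).

    Negative values mean the point moved towards the sensor between frames.
    """
    return (math.dist(p1, sensor) - math.dist(p0, sensor)) / dt


def doppler_shift_hz(v_radial, carrier_hz):
    """Two-way Doppler shift for a co-located transmitter and receiver.

    Sign convention (an assumption): an approaching target gives a positive shift.
    """
    return -2.0 * v_radial * carrier_hz / SPEED_OF_LIGHT


def doppler_histogram(frame_a, frame_b, sensor, dt, v_max=4.0, bins=8):
    """Bin per-point radial velocities into a coarse velocity profile."""
    counts = [0] * bins
    width = 2.0 * v_max / bins
    for p0, p1 in zip(frame_a, frame_b):
        v = radial_velocity(p0, p1, sensor, dt)
        v = max(-v_max, min(v_max - 1e-9, v))  # clamp into the histogram range
        counts[int((v + v_max) / width)] += 1
    return counts


if __name__ == "__main__":
    sensor = (0.0, 0.0, 0.0)
    # Two hypothetical tracked points: one approaching the sensor, one still.
    frame_a = [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0)]
    frame_b = [(0.0, 0.0, 1.85), (1.0, 0.0, 0.0)]
    print(doppler_histogram(frame_a, frame_b, sensor, dt=0.1))
    print(doppler_shift_hz(-1.0, 60e9))
```

Stacking one such velocity profile per video frame would give a time-by-velocity image of the kind a classifier could be trained on. A real system would also have to account for reflection strength, occlusion and sensor noise, none of which is modelled here.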
“We show how this cross-domain translation can be successful through a series of experimental results,” they write. “Overall, our approach is an important stepping stone towards significantly reducing the burden of training such human sensing systems and could help bootstrap uses in human-computer interaction.” Researcher Chris Harrison confirms the mmWave Doppler radar-based sensing doesn’t work for “very subtle stuff” (like spotting different facial expressions). But he says it’s sensitive enough to detect less vigorous activity, like eating or reading a book.
A need for line-of-sight between the subject and the sensing hardware also limits the motion detection ability of Doppler radar. (Aka: “It can’t reach around corners yet.” Which, for those concerned about detection, will surely sound slightly reassuring.) Detection does require special sensing hardware, of course. But things are already moving on that front: Google has been dipping its toe in via Project Soli, adding a radar sensor to the Pixel 4, for example. Google’s Nest Hub also integrates the same radar sensor to track sleep quality.
“One of the reasons we haven’t seen more adoption of radar sensors in phones is a lack of compelling use cases (sort of a chicken and egg problem),” says Harrison. “Our research into radar-based activity detection helps to open more applications (e.g., smarter Siris, who know when you are eating, making dinner, cleaning, or working out, etc.).” Asked whether he sees greater potential in mobile or fixed applications, Harrison reckons there are interesting use cases for both. “I [see use cases] in both mobile and nonmobile,” he says. “Returning to the Nest Hub… the sensor is already in the room, so why not use that to bootstrap more advanced functionality in a Google smart speaker (like rep counting your exercises)?
“There are a bunch of radar sensors already used in buildings to detect occupancy (but now they can detect the last time the room was cleaned, for example).” “Overall, the cost of these sensors is going to drop to a few dollars very soon (some on eBay are already around $1), so you can include them in everything,” he adds. “And as […] in your bedroom, the threat of a ‘surveillance society’ is much less worrisome than with camera sensors.” Startups like VergeSense already use sensor hardware and technology to power real-time analytics of indoor space and activity for the b2b market (such as measuring office occupancy).
But even with local processing of low-resolution image data, there could still be a perception of privacy risk around using vision sensors, certainly in consumer environments. Radar offers an alternative to visual surveillance that could be a better fit for privacy-risking consumer-connected devices such as ‘smart mirrors’. “If it is processed locally, would you put a camera in your bedroom? Bathroom? Maybe I’m prudish, but I wouldn’t personally,” says Harrison. He also points to earlier research that underlines the value of incorporating more types of sensing hardware: “The more sensors, the longer tail of interesting applications you can support. Cameras can’t capture everything, nor do they work in the dark.”
“Cameras are pretty cheap these days, so it is hard to compete there, even if radar is cheaper. I do believe the strongest advantage is privacy preservation,” he adds. Of course, having any sensing hardware, visual or otherwise, raises potential privacy issues. For example, a sensor that tells you when a child’s bedroom is occupied may be good or bad, depending on who has access to that information. (Do you want your smart speaker to know when you’re having sex?) And all sorts of human activity can generate sensitive information, depending on what’s happening. While radar-based tracking may be less invasive than other sensors, it doesn’t mean there are no potential privacy concerns.
It depends on where and how the sensing hardware is being used. Still, it is fair to argue that the data radar generates is likely less sensitive than comparable visual data were it to be exposed via a breach. “Any sensor should naturally raise the question of privacy: it is a spectrum rather than a yes/no question,” agrees Harrison. “Radar sensors are usually rich in detail but highly anonymizing, unlike cameras. If your Doppler radar [data leaked] online, it’d be hard to be embarrassed about it. No one would recognize you. If cameras from inside your house [leaked online], well….”
“It isn’t turnkey, but there are many large [video datasets to draw] from (including Youtube-8M),” he says. “It is orders of magnitude faster to download video data and create synthetic radar data than having to recruit people to come into your lab to capture motion data. “One is inherently 1 hour spent for 1 hour of [data]. In contrast, you can download hundreds of hours of footage easily from many excellently curated video databases these days. For [every hour of video], it takes us about 2 hours to process, but that is just on one desktop machine we have here in the lab. The key is that you can parallelize this, using Amazon AWS or equivalent, and process 100 videos at once, so the throughput can be extremely high.”
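To put rough numbers on that throughput claim, here is a back-of-the-envelope sketch. It assumes the quoted figure of about 2 hours of processing applies per hour of source video, and that each of the 100 parallel jobs runs at the same speed as the lab desktop. Both are assumptions, and the function name is illustrative.

```python
def video_hours_processed_per_day(processing_hours_per_video_hour, parallel_jobs):
    """Hours of source video converted to synthetic radar data in 24 hours.

    Assumes every job runs non-stop at the same speed (a simplification).
    """
    per_job = 24.0 / processing_hours_per_video_hour
    return per_job * parallel_jobs


if __name__ == "__main__":
    # One desktop machine at the assumed 2:1 processing ratio.
    print(video_hours_processed_per_day(2.0, 1))
    # 100 videos processed at once on cloud machines of similar speed.
    print(video_hours_processed_per_day(2.0, 100))
```

By comparison, capturing motion data from a volunteer in the lab yields at most one hour of data per hour spent, so a single capture setup could never exceed 24 hours of data in a day.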
And while RF signal does reflect, and does so to different degrees off of other surfaces (aka “multi-path interference”), Harrison says the signal reflected by the user “is by far the dominant signal”. That means they did not need to model other reflections to get their demo model working. (Though he notes that could be done to further hone capabilities “by extracting big surfaces like walls/ceiling/floor/furniture with computer vision and adding that into the synthesis stage”.) “The [Doppler] signal is actually very high level and abstract, and so [it can be processed in] real-time (much less ‘pixels’ than a camera),” he adds. “Embedded car processors use radar data for things like collision braking and blind spot monitoring, and those are low-end CPUs (no […] or anything).”
The research is being presented at the ACM CHI conference alongside another Group project, Pose-on-the-Go, which uses smartphone sensors to approximate the user’s full-body pose without needing wearable sensors. CMU researchers from the Group have also previously demonstrated a method for indoor ‘smart home’ sensing on the cheap (also without the need for cameras), as well as, last year, showing how an on-device AI assistant could be given more contextual savvy.
In recent years, they’ve also investigated using laser vibrometry and electromagnetic noise to give devices better environmental awareness and contextual functionality. Other interesting research out of the Group includes using conductive spray paint to turn anything into a touchscreen, and various methods to extend the interactive potential of wearables, such as by using lasers to project virtual buttons onto the arm of a device user or incorporating another wearable (a ring) into the mix. The future of human-computer interaction looks certain to be much more contextually savvy, even if current-gen ‘smart’ devices can still stumble on the basics and seem more than a little dumb.