Robots are watching us. Literally.
Google has curated a set of YouTube clips to help machines learn how humans move and act in the world. The dataset, called AVA ("atomic visual actions"), consists of three-second clips of people doing everyday things like drinking water, taking a photo, playing an instrument, hugging, standing or cooking.
Each clip is labeled with the person the AI should focus on, a description of their pose, and whether they're interacting with an object or another human.
“Despite exciting breakthroughs made over the past years in classifying and finding objects in images, recognizing human actions still remains a big challenge,” Google wrote in a recent blog post describing the new dataset. “This is due to the fact that actions are, by nature, less well-defined than objects in videos.”
Source: Lauren Tousignant, New York Post