MishaalRahman,
@MishaalRahman@androiddev.social

Gemini Nano with Multimodality just got announced! This new 3.8B-parameter model is designed to run on-device and can process not just text input but also audio and images. It’s coming later this year “starting with Pixel” and will be used for:

  • Clearer descriptions with TalkBack. TalkBack will soon be able to automatically generate more useful image descriptions. This will help people with visual impairments who can’t see images, especially when those images don’t have alt text already.

(1/2)

MishaalRahman,
@MishaalRahman@androiddev.social
  • Scam detection. A new feature that processes voice calls to detect and warn you when the person on the other end is trying to scam you. The feature will look for conversation patterns commonly associated with scams, like when a “bank representative” asks you to urgently transfer funds. If it spots one, your phone will buzz and show a warning. This feature will be opt-in, and audio is processed on-device by Gemini Nano with Multimodality, meaning your voice calls don’t leave the device.

(2/2)

cassidy,
@cassidy@blaede.family

@MishaalRahman this is the kind of ML-powered feature I’m actually happy about!
