OLIV is a novel end-to-end, AI-powered assistant system designed to help individuals with impaired vision locate displaced objects during their day-to-day tasks. To this end, OLIV draws on recent advances in AI-based speech recognition, speech generation, and object detection to understand the user's request and give directions to the relative location of the displaced object. OLIV consists of three main modules: i) a speech module, ii) an object detection module, and iii) a logic unit module. The speech module interprets the user's verbal query and delivers the system's response verbally. The object detection module identifies the objects of interest and their locations in a scene. Finally, the logic unit module combines the user's intent with the localized objects of interest to build a semantic description that the user can understand, which the speech module then conveys verbally to the user. Initial results from a proof-of-concept system trained to localize four different types of objects suggest that OLIV is feasible as a useful aid for individuals with impaired vision.
- Linda Wang
- Alexander Wong
- Anshuman Patnik
- Edrick Wong
- Justin Wong
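
The logic unit described in the abstract sits between the speech module and the object detection module. The sketch below is one possible way to picture that step, not the authors' implementation. It assumes the speech module has already produced a text transcript and the detector has already produced labeled bounding boxes. The `Detection` type, the function names, the labels `mug` and `wallet`, and the choice to describe position relative to the camera view in a 3x3 grid are all hypothetical illustrations.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Detection:
    """Hypothetical detector output; box coordinates are normalized to [0, 1]."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self) -> Tuple[float, float]:
        # Midpoint of the bounding box.
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def parse_query(transcript: str, known_labels: List[str]) -> Optional[str]:
    """Stand-in for intent parsing: return the first known label in the transcript."""
    words = transcript.lower().replace("?", " ").split()
    for label in known_labels:
        if label in words:
            return label
    return None


def describe_location(target: str, detections: List[Detection]) -> str:
    """Build a short sentence locating the target within a 3x3 grid of the view."""
    matches = [d for d in detections if d.label == target]
    if not matches:
        return f"I could not find the {target}."
    cx, cy = matches[0].center
    horizontal = "left" if cx < 1 / 3 else "right" if cx > 2 / 3 else "center"
    vertical = "top" if cy < 1 / 3 else "bottom" if cy > 2 / 3 else "middle"
    return f"The {target} is at the {vertical} {horizontal} of the view."


def answer(transcript: str, detections: List[Detection]) -> str:
    """Combine intent and detections into text for the speech module to speak."""
    labels = sorted({d.label for d in detections})
    target = parse_query(transcript, labels)
    if target is None:
        return "I did not recognize an object in your request."
    return describe_location(target, detections)


if __name__ == "__main__":
    scene = [
        Detection("mug", 0.1, 0.4, 0.3, 0.6),
        Detection("wallet", 0.7, 0.7, 0.9, 0.9),
    ]
    print(answer("Where is my wallet?", scene))
    # prints: The wallet is at the bottom right of the view.
```

In a full system, `parse_query` would be replaced by whatever intent understanding follows speech recognition, and the returned sentence would be passed to speech generation. The grid-based wording is only one way to phrase a relative location.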