Capture Wide-field and “Foveated” Imagery

To access environmental text, we use pan/tilt/zoom (PTZ) cameras to first localize text in a wide field of view and then zoom in to foveate suspected regions, because text detection and text decoding have different resolution requirements. A rapidly refreshable tactile Braille device delivers rich information to the user privately and promptly, and also receives control input from the user, closing a feedback loop.
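The detect-then-foveate step can be illustrated with a minimal sketch that turns a suspected text region in the wide-field image into a pan/tilt/zoom command. Everything here is an assumption for illustration rather than the project's actual controller: the function name `foveate_command`, the pinhole camera model, the sign conventions (positive pan to the right, positive tilt downward), and the `target_height_px` and `max_zoom` parameters are all hypothetical.

```python
import math


def foveate_command(box, image_size, hfov_deg,
                    target_height_px=40.0, max_zoom=20.0):
    """Return (pan_deg, tilt_deg, zoom) that centers and magnifies a text box.

    box          -- (x, y, w, h) of the suspected text region, in pixels
    image_size   -- (width, height) of the wide-field image, in pixels
    hfov_deg     -- horizontal field of view of the wide-field view
    Assumes an ideal pinhole camera with square pixels (hypothetical model).
    """
    x, y, w, h = box
    width, height = image_size

    # Focal length in pixels, derived from the horizontal field of view.
    focal_px = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)

    # Offset of the box center from the optical axis (image center).
    dx = (x + w / 2.0) - width / 2.0
    dy = (y + h / 2.0) - height / 2.0

    # Pan first, then tilt about the already-panned axis.
    pan_deg = math.degrees(math.atan2(dx, focal_px))
    tilt_deg = math.degrees(math.atan2(dy, math.hypot(dx, focal_px)))

    # Zoom so the text reaches the height assumed sufficient for decoding,
    # clamped to the range the (hypothetical) camera supports.
    zoom = min(max(target_height_px / h, 1.0), max_zoom)
    return pan_deg, tilt_deg, zoom


if __name__ == "__main__":
    pan, tilt, zoom = foveate_command((630, 230, 20, 20), (640, 480), 90.0)
    print(f"pan={pan:.1f} deg, tilt={tilt:.1f} deg, zoom={zoom:.1f}x")
```

The zoom factor is driven by the gap between the resolution at which text was detected and the resolution assumed necessary for decoding, which is the reason the two stages are separated in the first place.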

Incorporate Spatial Prior and Real-time 3D Mapping

Our method uses simultaneous localization and mapping (SLAM) to extract planar “tiles” representing scene surfaces. This work’s contributions include: 1) spatiotemporal fusion of tile observations via SLAM, prior to inspection, thereby improving the quality of the input data; and 2) combination of multiple noisy text observations into a single higher-confidence estimate of environmental text.
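As a rough illustration of the second contribution, the sketch below merges several noisy decodings of the same tile into one string. It uses a simple confidence-weighted, per-character vote, which is a stand-in technique chosen for brevity and not necessarily the fusion method used in this work. The function name `fuse_text_observations` and the `(text, confidence)` input format are hypothetical.

```python
from collections import defaultdict


def fuse_text_observations(observations):
    """Fuse noisy (text, confidence) readings of one tile into a single string.

    Stand-in technique: confidence-weighted voting, first on the string
    length and then on each character position.
    """
    if not observations:
        return ""

    # Pick the string length with the most total confidence behind it.
    length_support = defaultdict(float)
    for text, conf in observations:
        length_support[len(text)] += conf
    best_len = max(length_support, key=length_support.get)

    # Accumulate per-position character votes from readings of that length.
    votes = [defaultdict(float) for _ in range(best_len)]
    for text, conf in observations:
        if len(text) != best_len:
            continue  # readings of other lengths cannot be aligned here
        for i, ch in enumerate(text):
            votes[i][ch] += conf

    # Keep the best-supported character at each position.
    return "".join(max(v, key=v.get) for v in votes)


if __name__ == "__main__":
    readings = [("EXIT", 0.9), ("EX1T", 0.4), ("FXIT", 0.5), ("EXT", 0.3)]
    print(fuse_text_observations(readings))
```

Individual readings may each contain a wrong character, yet the fused result can still be correct as long as the errors fall at different positions or carry less total confidence than the correct character.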

1) Spatial Prioritization

2) Combine Multiple Observations

[1] H.-C. Wang, Y. Linda, M. Fallon, and S. Teller. Spatially Prioritized and Persistent Text Detection and Decoding. In Proc. of Camera-based Document Analysis and Recognition (CBDAR), 2013.

[2] H.-C. Wang, R. Namdev, C. Finn, and S. Teller. Text Spotting for the Blind and Visually Impaired. NSF Young Professional Workshop on Exploring New Frontiers in Cyber-Physical Systems, Washington, D.C., USA, 2014.

Decoding of Text and Symbols
