AI Vision

Group Members

Annie Chen

Edy Yan

Carina Qu


Microsoft Word

Mocking Bot

Adobe Illustrator


Design an application that assists blind and vision limited users in identifying everyday items. This may include things like money, price tag information, or description of other objects identified in the environment. It may also include things such as integration of QR code identification or local information provided by wireless beacons or similar technologies.

User Research:

In class, we met and talked with two potential users, Vincent and Synge.

         Vincent is blind. He can perceive light but has no perception of black or darkness. The electrical impulses appear to him as different colors, and he can only see lime green, blues, and greys. He can tell where a light source is, but he cannot see the light in the classroom clearly.

         Synge has low vision and can see some objects, depending on their distance.

          Both of them use iPhones because of their good accessibility: VoiceOver is easy to use, and texting is easy. Synge also mentioned that iOS considers the needs of different users, including blind and low vision users. One example Synge gave in class is that books on the iPhone are accessible to everyone, which is really convenient.

Apps that both of them frequently use:

Name: Seeing AI

Developer: Microsoft

Platform: iOS

Name: TapTapSee

Developer: CloudSight

Platform: iOS

Other aids they use:

  1. Bone conduction: the device vibrates, so the sound seems to come from inside the ear

  2. Magnifier: used when reading books or recipes

  3. Zoom function on phones and laptops

Bone conduction

User: Low Vision / Blind

Magnifier

User: Low Vision

Zoom Function on Phone

User: Low Vision

Our Design:

AI VISION is a free application on the iOS platform. Designed for the blind and low vision community, it uses the power of artificial intelligence to open up the visual world for them. The app is optimized for use with VoiceOver for guidance; it enables users to recognize text, documents, people, objects, scenes, currency, colors, barcodes, QR codes, and photos in an album.

Brainstorming:

Design Details:

Theme Color

According to our research with blind and low vision users (Vincent and Synge), we found that a blind user may still perceive light while having no perception of black or darkness. Vincent, for example, can see lime green, blues, and greys. Therefore, we decided to go with a vintage blue-grey as the theme color.






Our logo features the ‘CAMERA LENS’ as our ‘EYE’, indicating that the digital camera can see objects for us and be our eyes.

Main Screen

There is a taskbar at the top of the screen with one icon on each side. The main part is the camera screen. At the very bottom, there is a function bar containing three major icons. When an object is identified, an information display bar appears above the three icons and shows the written description for low vision users to see.

Human factors principles:

  • Most existing apps for blind and low vision users have little or no area on the screen for displaying words. A word display area may not matter to blind users, since they cannot see it; for users with low vision, however, a visual word description of the identified product is necessary.

  • We want the word description area to be large so that low vision users can see it better, but we also want the camera screen area to be large. There must be a tradeoff between these two major areas. During our early discussions, we wanted to keep the large camera screen and shrink the word description area by adding a scroll bar at its side.

Three Major Function Categories

The three functional icons are one of our major design decisions. After researching existing apps for blind and vision-impaired people, we found that some apps contain nearly a dozen functional icons, which are either placed in a swipe bar or displayed all together on the main screen. In our design, we decided to go with only THREE icons.

Why choose these three icons on the screen? (Text, Scene, Barcode)

  • In Seeing AI, the ‘Document’ function can also identify short text, so there is no need for two separate icons for text and documents. We named the combined icon ‘TEXT’.

  • In Seeing AI, there are four separate icons: person, money, color, and scene. In TapTapSee, however, there is only one general category that can identify all of these. To make things more convenient for low vision and blind users, we decided to use a single icon named ‘SCENE’, which can identify people, colors, money, and scenes.

  • We decided to name the third icon ‘Barcode’. This function is mostly used in malls or markets, where users can scan the barcode on an object to get a SPECIFIC description of it. Users can also use this function to scan QR codes to get information in everyday life.

Why does selecting an icon not require swiping?

  • Seeing AI uses a swipeable channel bar to select each icon, which is difficult for low vision users to operate when VoiceOver is off.

  • Seeing AI has many functions that are rarely used in daily life. According to human factors principles, a swipe bar suits users who have no trouble identifying items and who know where to swipe back and forth. For low vision and blind users, however, a swipe bar causes difficulties: they need two hands to swipe and select the desired function.

Why put the ‘scene’ icon in the middle and the ‘text’ and ‘barcode’ icons at the sides?

  • According to our research and observation of blind and vision-impaired users, recognizing everyday objects is the most frequently used function.

  • Considering human factors principles, display order affects usability. Similar to how people take photos by tapping the capture button in the middle, we decided to put the ‘scene’ button in the middle, which is the most comfortable and most frequently used place for users to tap. The other two buttons are located at the two sides.

Icon design: why is there no text underneath?

  • In Seeing AI, a word under each icon indicates what the icon is, such as the word ‘Document’ under the document icon. In our design, we decided not to put any words under the icons, because the icons themselves clearly convey their functions. The letter ‘A’ represents text and documents, a 3D cube represents ‘scene’ and ‘object’, and a ‘barcode’ represents scanning barcodes and QR codes.


  • When users want to identify objects, scenes, people, or colors, they can tap the middle icon, which represents this function. They will hear the audio guidance ‘Scene’ when they tap the correct button.

  • Afterward, users tap the screen to take a picture, and they will hear audio saying ‘processing’, which means the system is processing the photo.

  • After several seconds, users will see the word description on the screen and hear the audio description at the same time.

  • If users want to save the current picture for future reference, they can tap the ‘download’ icon in the top right corner. Users will hear audio saying ‘saved’ to indicate that the current picture has been successfully saved to the album.

  • If users want to go back and take another picture, they can tap the ‘arrow’ icon in the top left corner to return to the main screen, where they can take another picture.
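The scene flow above can be sketched as a small state machine. This is only an illustrative sketch, not the app's implementation: the class `SceneFlow`, its method names, and the `describe` callable standing in for the recognizer are all hypothetical, and spoken audio is modeled as a list of strings.

```python
from enum import Enum, auto


class Screen(Enum):
    MAIN = auto()    # camera screen with the three icons
    RESULT = auto()  # description shown after a photo is taken


class SceneFlow:
    """Hypothetical model of the scene flow; audio is recorded in `spoken`."""

    def __init__(self, describe):
        self.describe = describe  # assumed recognizer: photo -> description
        self.screen = Screen.MAIN
        self.spoken = []          # audio prompts, in the order they are played
        self.saved = []           # photos saved to the album
        self.photo = None
        self.description = None

    def select_scene(self):
        self.spoken.append("Scene")

    def tap_screen(self, photo):
        self.spoken.append("processing")
        self.photo = photo
        self.description = self.describe(photo)
        # The description is shown on screen and spoken at the same time.
        self.spoken.append(self.description)
        self.screen = Screen.RESULT

    def save(self):
        # Saving is only possible once a picture has been taken.
        if self.screen is Screen.RESULT:
            self.saved.append(self.photo)
            self.spoken.append("saved")

    def back(self):
        self.screen = Screen.MAIN
        self.photo = None
        self.description = None
```

Keeping the audio prompts in one place like this makes it easy to check that every visual change has a matching spoken cue.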


  • When users want to identify short text (printed or handwritten) or documents, they can tap the left icon, which represents this function. They will hear the audio guidance ‘Text’ when they tap the correct button.

  • Afterward, users hold their phone near some text. The audio starts speaking as soon as text appears in front of the camera. If the user wants to identify a printed document, audio guidance will explain how to take a photo with the margins included.

  • For low vision users, magnified text will appear on the screen so that they can see it clearly.


  • When users want to scan and identify a barcode or QR code, they can tap the icon on the right, which represents this function. They will hear the audio guidance ‘Barcode’ when they tap the correct button.

  • Afterward, users can hold the object with the barcode in one hand and hold the phone in the other hand, facing the object.

  • A ‘beep’ sound will be heard when the phone is close to the barcode: the faster the beep, the closer it is.

  • If users do not hear the ‘beep’ sound, they should slowly rotate the object or turn it to different sides, depending on its shape, until they hear the beep.

  • Once users hear audio saying ‘processing’, the barcode has been successfully captured. Users will then hear the description of the object. A written description will also be shown on screen for low vision users to see.
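The "faster beep means closer" cue above could be implemented as a simple mapping from distance to the pause between beeps. The sketch below is a minimal illustration under assumptions: it presumes the app can estimate the distance to the barcode, and the function name `beep_interval`, the 50 cm range, and the 0.1 s to 1.0 s interval limits are hypothetical values chosen for the example.

```python
from typing import Optional


def beep_interval(distance_cm: float,
                  max_range_cm: float = 50.0,
                  fastest_s: float = 0.1,
                  slowest_s: float = 1.0) -> Optional[float]:
    """Return the pause between beeps in seconds, or None when out of range.

    All default values are illustrative assumptions, not measured settings.
    """
    if distance_cm < 0 or distance_cm > max_range_cm:
        return None  # no beep: the user should rotate or move the object
    # closeness is 0.0 at the edge of the range and 1.0 at the barcode itself
    closeness = 1.0 - distance_cm / max_range_cm
    # Linear interpolation: the closer the barcode, the shorter the pause.
    return slowest_s - closeness * (slowest_s - fastest_s)
```

A linear mapping is the simplest choice; a real design might prefer a few discrete beep rates if users find them easier to tell apart.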

Information Icon (Quick Help)

  • We put a small ‘circled question mark’ icon button in the top right corner of each page. When users tap this icon, they will hear audio saying ‘quick help’, and an introduction to the current function will then be shown on the screen.

  • We added a new feature to the quick help page: a ‘loudspeaker’ button under the description paragraph. This improves on the similar function in Seeing AI, which has no separate icon for playing the audio description. If blind users have VoiceOver turned on, they will hear the paragraph read aloud automatically; low vision users who have not turned on VoiceOver can tap the ‘loudspeaker’ button to hear the audio description.

  • Considering human factors principles, this new feature gives more help to low vision users who want to listen to the audio while reading along.

Menu (Browsing photos, help, feedback, settings, and about)

  • We put one ‘menu’ button at the top left corner of the main screen. When users click on it, a second screen will appear below, which shows five separate icons, each with a specific function.

  • When users want to identify past photos, they can tap the ‘browsing photos’ icon by following the audio guidance; they will then be able to see the photos previously saved in the album.

  • Users can then select the photo they want the app to recognize. Afterward, both the written description and the audio description will be presented.

  • The Help page is where users can ask questions about the functions of this app and receive real-time responses.

  • After users tap the ‘Help’ button, a screen appears where they can type their questions and receive immediate automatic responses. We also added a feature that lets users speak their questions instead of typing them, by pressing the ‘Hold to talk’ button at the bottom of the screen. This helps blind and low vision users ask questions, since typing might be difficult for them.

  • The system contains all the frequently asked questions that users might encounter. We believe this auto-response function will help solve users’ ongoing problems.

  • Moreover, if users encounter problems that the auto-response cannot solve, they can call our assistant at 123-456-7890.
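The help flow above (automatic answer first, phone support as the fallback) can be sketched with simple keyword matching. This is an illustrative sketch only: the `FAQ` entries, the `answer` function, and the matching rule are hypothetical, and a real system would more likely use a proper search or language model.

```python
import re

# Hypothetical FAQ entries: keywords -> canned reply.
FAQ = {
    ("save", "photo"): "Tap the download icon in the top right corner to save the picture.",
    ("barcode", "beep"): "Slowly rotate the object until you hear the beep.",
}

# Fallback uses the support number given in the text above.
FALLBACK = "Sorry, I could not find an answer. Please call our assistant at 123-456-7890."


def answer(question: str) -> str:
    """Return the FAQ reply sharing the most keywords with the question."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    best_reply, best_hits = FALLBACK, 0
    for keywords, reply in FAQ.items():
        hits = sum(1 for keyword in keywords if keyword in words)
        if hits > best_hits:
            best_reply, best_hits = reply, hits
    return best_reply
```

Because typed and spoken questions both end up as text, the same function could serve the keyboard input and the ‘Hold to talk’ input once speech has been transcribed.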

  • We include a feedback page because user feedback will greatly help our future improvements. When users type in feedback, the audio will also guide them to enter their name or contact information.

  • After they finish the feedback, they can tap the ‘submit’ button, and our assistant will receive it.

  • Settings:

  • Users can set different preferences within this function, such as currency, voice speed, reorder channels, and languages.

  • About: 

  • This page introduces our app.

  • AI VISION is a free application on the iOS platform. Designed for the blind and low vision community, it uses the power of artificial intelligence to open up the visual world for them. The app is optimized for use with VoiceOver for guidance; it enables users to recognize text, documents, people, objects, scenes, currency, colors, barcodes, QR codes, and photos in an album.

Future Improvements

 Adding facial recognition and object recognition.

  1. When users try to identify an object that does not exist in the database, they can add its information to their own database for future use.
  2. When users try to identify a person, they can add the person's name to their database for future use.
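The personal database idea above can be sketched as a lookup that checks user-added labels before the built-in ones. This is a hypothetical sketch: the class `PersonalLabels` and the notion of a string "fingerprint" identifying an object or face are assumptions made for illustration, since the text does not say how items would be matched.

```python
from typing import Dict, Optional


class PersonalLabels:
    """Built-in labels plus labels the user adds for unknown objects or people."""

    def __init__(self, builtin: Dict[str, str]):
        self._builtin = dict(builtin)       # labels shipped with the app
        self._custom: Dict[str, str] = {}   # labels added by this user

    def identify(self, fingerprint: str) -> Optional[str]:
        # User-added labels take priority over built-in ones.
        if fingerprint in self._custom:
            return self._custom[fingerprint]
        return self._builtin.get(fingerprint)

    def add(self, fingerprint: str, label: str) -> None:
        self._custom[fingerprint] = label
```

For example, when `identify` returns `None`, the app could prompt the user by voice to name the item and then call `add` so the item is recognized next time.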

Adding Siri data connection with specific features.

  1. Adding easy Siri shortcuts so that people who use VoiceOver can open a specific function in the application directly, instead of swiping through each page to find the correct button.

Adding phone vibration interaction.

  1. Adding phone vibration feedback during use to give users a better experience, since people with vision problems tend to rely more on their sense of touch.
© 2020 Xiaoyun Chen, All Rights Reserved