Technical Walk-Through of Our Development Process
Our goal is to create a 3x2 solenoid array that can display braille characters by pushing solenoid pins up and down to form dots. The array is driven by a Raspberry Pi, which in turn is connected to an ESP32CAM. The camera takes a picture of a page of text, then OCR (optical character recognition) is performed to extract a string of text from the image. That string is converted to braille and displayed on the solenoid array by flashing each character for one second at a time. This device essentially allows for real-time conversion of any text into braille, which we hope will make books and similar materials more accessible.
Our initial idea was to design a text-to-braille converter, which a blind person could use by moving the device over a page of text to convert it into braille. The braille translation of the English text would then be represented via a series of up/down pins which the user could read by touch. The device was designed as a rectangular box that would use an internal camera to capture and OCR text, which could then be translated into braille and displayed via a series of servo motors pushing up metal rods on the top of the box.
However, after consulting with Stuart Christhilf, who'd thought of a similar mechanism for his initial final project, we changed direction. He'd originally planned to create a dynamic clock to display the time using blocks of wood that could be pushed out or pulled back via servos. However, when building his project, he realized that fitting so many servos into such a small space was completely unfeasible and warned us against doing the same.
We then decided to use electromagnets for our pins, instead of servos. The pins themselves would be small magnetic rods sitting on top of electromagnets. Each small electromagnet could be powered on and off via a microcontroller...
Although a large part of our project remains the same, we've changed some aspects of the design. Namely, we've decided to use a Raspberry Pi as a central controller and connect it to 5 separate ATTiny412 chips, which will each be responsible for controlling 6 electromagnets to represent 1 braille character.
Additionally, we decided to create an elevated case for the ESP32 camera so that the image would be captured at a better angle and thus be easier to process for OCR, and so that more light could reach the camera lens from the unobstructed sides. We also implemented wireless data transmission from the ESP32 camera to the Raspberry Pi for processing.
Here is an updated system diagram which maps out all the parts of our project:
After doing research, we realized that having 30 solenoids would be unfeasible. Instead, we decided to scale the project down to just 6 solenoids, as this would still accomplish the mission of displaying braille for a reader. We now flash each braille character for one second on the 6-solenoid array. This change lets us better manage the power budget and ensures a reliable final product.
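The per-character flashing described above can be sketched in Python. The dot numbering (1-3 down the left column, 4-6 down the right) follows standard braille; the dictionary here covers only a few letters, and printing the pattern stands in for actually energizing the solenoids:

```python
import time

# Standard braille dot numbers for a few letters:
# dots 1-3 run down the left column, 4-6 down the right.
BRAILLE = {
    "a": {1},
    "b": {1, 2},
    "c": {1, 4},
    "h": {1, 2, 5},
    "i": {2, 4},
    " ": set(),
}

def to_patterns(text):
    """Convert a string into a list of 6-element dot patterns (1 = raised)."""
    patterns = []
    for ch in text.lower():
        dots = BRAILLE.get(ch, set())
        patterns.append([1 if d in dots else 0 for d in range(1, 7)])
    return patterns

def flash(text, hold=1.0):
    """Show each character's pattern for `hold` seconds (pin driving is stubbed)."""
    for pattern in to_patterns(text):
        print(pattern)  # stand-in for driving the six solenoids
        time.sleep(hold)
```

For example, `to_patterns("hi")` yields the raised-dot patterns for 'h' (dots 1, 2, 5) and 'i' (dots 2, 4) in sequence.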
The design process began with a rectangular prism to serve as the main body of the structure.
Next, the edges were filleted to create a rounded appearance.
A sketch was created on the top of the box with six circles. These 6 circles represent the holes for the metal pins that can pop up and down depending on what needs to be represented.
The circles were extruded downward as holes, creating the actual space where the pins will be placed.
After activating the virtual environment, we can install all of the library dependencies (inside a venv, pip should be run without sudo):
pip install pytesseract
pip install opencv-python
The following program was developed for the Raspberry Pi to run.
import time
import cv2
import urllib.request
import numpy as np
import pytesseract

url = 'http://10.12.28.193/capture'

# Fetch a single JPEG frame from the ESP32CAM's capture endpoint
img_resp = urllib.request.urlopen(url)
imgnp = np.array(bytearray(img_resp.read()), dtype=np.uint8)
frame = cv2.imdecode(imgnp, -1)

# Run OCR; --psm 7 treats the image as a single line of text
text = pytesseract.image_to_string(frame, config='--psm 7')

print("Extracted Text:", text)
time.sleep(1)
The design began with a shelled box structure.
The electronics proved to be one of the most challenging aspects of the project. Our main challenges revolved around understanding transistors, particularly dealing with power handling capabilities and inconsistent pinouts that were sometimes backwards or jumbled.
We first tried using a TIP120 transistor, but as a Darlington BJT with a relatively large saturation voltage drop, it ran hot and couldn't reliably handle the solenoid's current. After testing, we switched to an IRF520 MOSFET, which worked much better. However, there were still issues with the pinouts being different than what we expected.
After fixing the pinout issues, we created a simple test circuit with an Arduino Uno to verify that the MOSFET could properly control the solenoid. The test circuit consisted of:
Once we confirmed the circuit worked, our team designed a PCB that would hold six of these circuits - one for each solenoid in the braille array. The PCB design includes:
We had previously set up infrastructure to wirelessly transmit a capture command from the Raspberry Pi to the ESP32CAM, along with sending the image data back over the network and saving it. The team created a WebSocket server to accept commands and then send the image data over HTTP back to the Raspberry Pi.
The ESP32CAM is pointed towards a paper with the words "Hello World!". On the right side of the picture, the Raspberry Pi running the code is visible along with the display. Upon running the program in the Pi's terminal, the ESP32CAM takes a picture and transmits it to the Pi, which then uses Tesseract to perform OCR on it and prints out the extracted text.
WebSocket connections are initiated through HTTP protocol, using an upgrade request from HTTP to WebSocket. This begins with a client sending a standard HTTP request that includes an “Upgrade: websocket” header and a “Connection: Upgrade” header to the server. The server then responds with an HTTP 101 status code, indicating that the protocol will change, thus establishing the WebSocket connection.
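To illustrate the handshake, the server's 101 response must include a Sec-WebSocket-Accept header derived from the client's Sec-WebSocket-Key, per RFC 6455: the key is concatenated with a fixed GUID, SHA-1 hashed, and base64 encoded. A minimal Python sketch of just that computation:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the WebSocket opening handshake
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def websocket_accept(client_key):
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

# Using the example key from RFC 6455:
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
# -> s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```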
// Add the WebSocket server code here
// Include the connection handling and data transmission
To optimize computational power usage, we explored processing the image in base64. Although this approach didn't ultimately scale down the computing enough to run on a microcontroller, it led us to an interesting solution: using GPT4o's multimodal capabilities as an OCR engine to extract text from the base64 image. GPT4o proved to be much more accurate in OCR than pytesseract, making it the better choice.
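A sketch of how such a request can be assembled, assuming the OpenAI chat-completions format with the image embedded as a base64 data URI. The model name and prompt are illustrative, and the actual HTTP call is omitted:

```python
import base64

def build_ocr_request(jpeg_bytes, model="gpt-4o"):
    """Build a chat-completions payload asking the model to transcribe an image.

    The JPEG is embedded as a base64 data URI; actually POSTing this payload
    to the chat completions endpoint is left out of this sketch.
    """
    b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Transcribe the text in this image exactly."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    }
```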
static esp_err_t jpg_base64_handler(httpd_req_t *req) {
    camera_fb_t *fb = esp_camera_fb_get();
    if (!fb) {
        Serial.println("Camera capture failed");
        httpd_resp_send_500(req);
        return ESP_FAIL;
    }

    // Encode the frame in base64
    String base64Image = base64::encode(fb->buf, fb->len);

    // Send the base64 encoded image
    httpd_resp_set_type(req, "text/plain");
    esp_err_t res = httpd_resp_send(req, base64Image.c_str(), base64Image.length());

    // Return the frame buffer
    esp_camera_fb_return(fb);

    return res;
}
The Raspberry Pi sends a byte-encoded text string to the ATTiny1614. From there, the ATTiny1614 is responsible for interpreting and converting the received text into braille dot arrays, which it then shows on the 3x2 array.
/*
  Solenoid arrangement:
  0 1
  2 3
  4 5
*/

int sols[6] = {0, 1, 2, 3, 9, 8}; // Define the pins connected to the solenoids

// Define the Braille arrays
int a[6] = {0, 1, 1, 1, 1, 1};
int b[6] = {0, 1, 0, 1, 1, 1};
int c[6] = {0, 0, 1, 1, 1, 1};
// ... more letter definitions ...

typedef struct {
    char character;
    int *braille_array;
} BrailleMap;

BrailleMap braille_dictionary[] = {
    {'a', a}, {'b', b}, {'c', c}, {'d', d}, {'e', e},
    {'f', f}, {'g', g}, {'h', h}, {'i', i}, {'j', j},
    // ... more mappings ...
};
The assembly process began with outlining the general setup of the final project. Each MOSFET was secured to a corresponding battery pack and solenoid, with color-coded trigger and GND wires. This organization ensures that toggling solenoids 1 through 6 controls each solenoid in sequence.
While there are existing technologies on the market that can convert text to braille in real time, those are often expensive and not readily available to the public. Our goal with this project is to create a product that can be cheaply produced and reach a wide audience. Through this development process, we've learned several key lessons: