I recently had the opportunity to dig into the OCR capabilities of UiPath for a client. I thought the results were enlightening, so I wanted to share them here.
UiPath has a drag-and-drop interface for building RPA bots. One of the built-in activities is called "Read PDF with OCR." Using it is as simple as dragging it into your flow and choosing one of the six OCR engines that UiPath has built integrations with.
About the OCR Engines
- Google OCR - This actually uses the open-source Tesseract OCR engine, so it is free to use. The processing is done on the local machine where UiPath is running.
- Google Cloud OCR - This requires a Google Cloud API key. Google Cloud offers a free trial.
- Microsoft OCR - This uses the MODI OCR Engine, which is also free to use, and the processing is done locally like Google OCR.
- Microsoft Cloud OCR - This uses the Microsoft Computer Vision API, which is also free to sign up for.
- Abbyy OCR - This requires you to install Abbyy FineReader on your local machine and purchase a license.
- Abbyy Cloud OCR - This requires a subscription.
My Results
- Google OCR - This is easy to use because it's built into UiPath, but I found that it is the slowest option. Multi-page documents can take upwards of 30 seconds to get through.
- Google Cloud OCR - This option is fast and accurate. I was able to create an API key and get it working with UiPath pretty quickly. I did have to turn on billing to get the API key to work, but they give you 1,000 pages per month for free, which was more than enough for what I was doing.
- Microsoft OCR - This is similar to the built-in Google OCR option in that it's free and easy to use. For some document types it was more accurate than Google OCR, but for others it was less accurate.
- Microsoft Cloud OCR - I signed up for an API key, but I got an error in UiPath when I tried to use it. Based on some research I've done, I'm not the only one having this issue. It might have to do with some changes to Microsoft's API that have not yet been reflected in UiPath.
- Abbyy OCR - I downloaded a free trial of Abbyy FineReader to the machine I was using, and I was very impressed. FineReader can do lots of cool things including converting a scanned PDF to a searchable PDF or other document formats. Unfortunately, I was not able to get this option working with UiPath, and I'm not sure why.
- Abbyy Cloud OCR - I signed up for a free trial of this service, and it was easy to get it working with UiPath. This option was fast and accurate like Google Cloud OCR. Abbyy really shines in the way it formats the output text -- the text is spaced out to mimic the spacing of the original PDF, whereas most of these options simply output the text line by line.
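To make the difference between layout-preserving output and line-by-line output concrete, here is a rough Python sketch. It is not how Abbyy or UiPath produce their text. It assumes a hypothetical input of (text, x, y) word positions in points, and the `char_width` and `line_height` values are made-up defaults for illustration.

```python
from collections import defaultdict


def _group_rows(words, line_height):
    """Bucket (text, x, y) words into rows by their y position."""
    rows = defaultdict(list)
    for text, x, y in words:
        rows[round(y / line_height)].append((x, text))
    return rows


def render_plain(words, line_height=12.0):
    """Line-by-line output: words on a row joined by single spaces."""
    rows = _group_rows(words, line_height)
    return "\n".join(
        " ".join(text for _, text in sorted(rows[row])) for row in sorted(rows)
    )


def render_layout(words, char_width=6.0, line_height=12.0):
    """Layout-style output: each word starts at a column derived from x."""
    rows = _group_rows(words, line_height)
    lines = []
    for row in sorted(rows):
        line = ""
        for x, text in sorted(rows[row]):
            col = int(x / char_width)
            # Never overlap the previous word; keep at least one space.
            if line and col <= len(line):
                col = len(line) + 1
            line = line.ljust(col) + text
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical word positions from a two-line invoice snippet.
    sample = [("Invoice", 0, 0), ("Total", 0, 24), ("$42.00", 120, 24)]
    print(render_plain(sample))
    print(render_layout(sample))
```

The layout version keeps the amount far to the right of its label, which is the kind of spacing that makes columns in invoices and tables easier to parse downstream.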
Unsurprisingly, the paid OCR engines performed the best, especially with scanned documents. None of the engines read low-quality scans perfectly, but the cloud options were closest. If OCR is a key part of your project, I recommend trying all of your available options for the specific document types you're working with to find the best option that works within your project budget.
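If you want to put numbers on that comparison, one simple approach is to hand-type the correct text for a few sample pages and score each engine's output against it. The sketch below is only one way to do that, using Python's standard `difflib`. The engine names and sample strings are hypothetical placeholders, and the similarity ratio is a rough proxy for accuracy, not a formal OCR metric.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Collapse all whitespace so spacing differences are not penalized."""
    return " ".join(text.split())


def similarity(expected, actual):
    """Return a 0.0 to 1.0 similarity ratio between two texts."""
    matcher = SequenceMatcher(
        None, normalize(expected), normalize(actual), autojunk=False
    )
    return matcher.ratio()


def rank_engines(expected, outputs):
    """Sort engines from most to least similar to the expected text.

    `outputs` maps an engine name to the text that engine produced.
    """
    scores = {name: similarity(expected, text) for name, text in outputs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical ground truth and engine outputs for one page.
    truth = "Invoice 1001 Total 42.00"
    results = {
        "engine_a": "Invoice 1001  Total 42.00",
        "engine_b": "lnvoice 1OO1 Tota1 42.OO",
    }
    for name, score in rank_engines(truth, results):
        print(f"{name}: {score:.3f}")
```

Because whitespace is normalized before scoring, this rewards getting the characters right and ignores layout, so it is worth checking formatting separately if that matters for your project.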