How to scrape text from an image in Chrome

Brady Gavin has been immersed in technology for 15 years and has written over 150 detailed tutorials and explainers. He’s covered everything from Windows 10 registry hacks to Chrome browser tips. Brady has a diploma in Computer Science from Camosun College in Victoria, BC.

Usually, you use Optical Character Recognition (OCR) software to extract text from an image. However, as of Google Chrome 76, you can use an experimental feature to scrape text from images without any additional software.

Detecting text with OCR is computationally expensive. However, hardware manufacturers have supported shape detection for quite some time.

Enter the Shape Detection API. It relies on hardware acceleration from the device it runs on. The API can detect barcodes (including QR codes), faces, and text. You can read more about the project on the developer’s website, where he goes into detail about how the API works. For more on text detection, check out the Web Incubator Community Group website.
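
To make the text detection part more concrete, here is a minimal sketch of how a page might call the API. It assumes the `TextDetector` interface from the Web Incubator Community Group draft (its `detect()` method and the `rawValue` and `boundingBox` fields of each result). The helper names `joinDetectedText` and `extractText` are hypothetical, and the top-to-bottom ordering is a simple heuristic, not something the API guarantees.

```typescript
// Minimal shapes for the parts of the Shape Detection API used here.
// TextDetector is experimental and absent from the standard TypeScript DOM typings.
interface DetectedTextLike {
  rawValue: string;
  boundingBox: { x: number; y: number };
}

interface TextDetectorLike {
  detect(image: unknown): Promise<DetectedTextLike[]>;
}

// Hypothetical helper: order detected blocks top-to-bottom, then left-to-right,
// and join them with newlines. This is a naive reading-order heuristic.
function joinDetectedText(blocks: DetectedTextLike[]): string {
  return [...blocks]
    .sort(
      (a, b) =>
        a.boundingBox.y - b.boundingBox.y || a.boundingBox.x - b.boundingBox.x
    )
    .map((block) => block.rawValue)
    .join("\n");
}

// Hypothetical helper: returns the extracted text, or null when the browser
// does not expose TextDetector (for example, when the flag is disabled).
async function extractText(image: unknown): Promise<string | null> {
  const Ctor = (
    globalThis as unknown as { TextDetector?: new () => TextDetectorLike }
  ).TextDetector;
  if (typeof Ctor !== "function") {
    return null;
  }
  const detector = new Ctor();
  return joinDetectedText(await detector.detect(image));
}
```

In a page, something like `extractText(await createImageBitmap(file))` could be called with the image file the user picked. A `null` result would mean the feature is unavailable in that browser.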

To use this feature, you have to enable an experimental flag in Chrome. When you enable anything from chrome://flags, you’re using unfinished features that haven’t been tested on all devices and could misbehave. You might run into a few bugs, so be careful when you play around with the available flags.

For this guide, we’re using a Windows PC, but everything should work identically on all other platforms, including mobile devices.

To get started, fire up Chrome, type chrome://flags into the Omnibox, press Enter, and then type “Experimental web platform” in the search bar.

Alternatively, you can paste chrome://flags/#enable-experimental-web-platform-features into the Omnibox, and then press Enter to go directly to the flag.

Next, click the drop-down box next to the “Experimental Web Platform” flag, and then click “Enabled.”

For changes to take effect, you must restart Chrome. Click the blue “Relaunch Now” button at the bottom of the page.

When Chrome relaunches, head to https://copy-image-text.glitch.me/ to upload the image with the text you want to extract. Click “Choose File.”

Select the image file from your computer and click “Open.”

Although you’re “uploading” an image to the site, you can use this tool offline, as well. As soon as you navigate to the site, all the resources are saved in the cache.

After the file uploads, click “Submit.”

The page reloads with the extracted text. You can now copy the text from the webpage and paste it into any text editor or word processor.

The feature is slightly buggy at this writing. In our test, only about half the document was uploaded and scanned. However, these issues should be resolved in time.

Newly released version 5!

Data Miner is a Google Chrome Extension and Edge Browser Extension that helps you crawl and scrape data from web pages and into a CSV file or Excel spreadsheet.

An Easy to Use tool to Automate Data Extraction

Intuitive User Interface and workflow

Data Miner has an intuitive UI to help you execute advanced data extraction and web crawling.

With just a few clicks, you can run any of the more than 60,000 data extraction rules in the tool, or create your own customized extraction rules to get only the data you need from a web page.

Single page or automated scraping

Data Miner can scrape a single page or crawl a site and extract data from multiple pages, such as search results, products and prices, contact information, emails, phone numbers, and more. Data Miner then converts the scraped data into a clean CSV or Microsoft Excel file for you to download.

Data Miner comes with a rich set of features that help you extract any text on a page that you see in your browser. It can automatically click on buttons and links, follow subpages, open pop-ups, and scrape data from them.
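
As an illustration of the CSV conversion step, the sketch below turns scraped rows into CSV text with standard quoting. It is not Data Miner's actual code. The names `Row`, `escapeCsvField`, and `toCsv` are hypothetical, and it assumes the common convention of quoting fields that contain commas, quotes, or line breaks.

```typescript
// A scraped record: column name -> cell value (hypothetical shape).
type Row = Record<string, string | number | null>;

// Quote a field only when it contains a comma, a double quote, or a line break.
// Embedded double quotes are doubled, as is conventional for CSV.
function escapeCsvField(value: string | number | null): string {
  const text = value === null ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build CSV text with a header line followed by one line per row.
// Columns missing from a row are written as empty fields.
function toCsv(rows: Row[], columns: string[]): string {
  const header = columns.map(escapeCsvField).join(",");
  const body = rows.map((row) =>
    columns.map((column) => escapeCsvField(row[column] ?? null)).join(",")
  );
  return [header, ...body].join("\r\n");
}
```

For example, `toCsv([{ name: "Widget, large", price: 9.5 }], ["name", "price"])` yields a header line followed by `"Widget, large",9.5`.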

New features in Data Miner 5.0

Quick and Simple Scraping

Scrape with one click.
Use 50,000+ free pre-made queries made for 15,000+ popular websites.

Streamlined workflow

Crawl URLs, perform pagination, and scrape a single page all in one place.

No Coding Required

The new Easy Finder tool helps you find CSS selectors and create your own custom recipes.

Secure Web crawling and Scraping

Safe and Secure to use

Data Miner behaves as if you were clicking on the page yourself in your own browser.

Scrape Without Worry

Data Miner is not a Bot.
You will not get blocked.

Keep Your Data Private

Data Miner never sells your data.
Data Miner never shares your data.

Data Miner is the most powerful scraper around

One Click Scraping

Use one of 50,000 publicly available extraction queries to extract data with one click.

Custom Scraping

Make custom extraction queries to scrape any data from any site.

Automate Scrapes

Run bulk scrape jobs based on a list of URLs.

Fast Table Scrapes

Extract basic table data by right-clicking on the page.

Pagination

Automatically click to the next page and scrape using Auto Pagination.
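
The general idea behind automatic pagination can be sketched as a loop that keeps following a "next page" link until there is none. This is an illustrative sketch only, not how Data Miner is implemented. The `Page` shape, the `PageLoader` callback, and the `maxPages` safety limit are assumptions made for the example.

```typescript
// One scraped page: the extracted items plus the URL of the next page, if any.
interface Page<T> {
  items: T[];
  nextUrl: string | null;
}

// Hypothetical callback that loads a URL and extracts its items and next link.
type PageLoader<T> = (url: string) => Promise<Page<T>>;

// Follow "next" links from startUrl, collecting items from every page.
// Stops when there is no next page, when a URL repeats (to avoid loops),
// or after maxPages pages have been visited.
async function scrapeAllPages<T>(
  startUrl: string,
  loadPage: PageLoader<T>,
  maxPages = 50
): Promise<T[]> {
  const results: T[] = [];
  const seen = new Set<string>();
  let url: string | null = startUrl;
  while (url !== null && !seen.has(url) && seen.size < maxPages) {
    seen.add(url);
    const page: Page<T> = await loadPage(url);
    results.push(...page.items);
    url = page.nextUrl;
  }
  return results;
}
```

Keeping the page loading in a separate callback means the same loop works whether the next page is reached by fetching a URL or by simulating a click.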

Form Filling Automation

Data Miner can automatically fill forms for you using a pre-filled CSV.

The web browser is fundamental to getting the most out of an Internet connection. Whether we use Chrome, Firefox, or Edge, we can install add-ons that extend what the browser can do. To extract text from images, videos, or PDFs, we can use an extension like Copyfish, which we look at today.

Sometimes we need to extract text from images, whether for storage, mailing, or translation. Until now, the most practical option has been to retype the text by hand. Images can contain many elements, such as photographs, graphics, and diagrams, and the browser add-on we discuss here extracts the text embedded in them.

What is Copyfish

Copyfish is an extension for Google Chrome, Mozilla Firefox, and Microsoft Edge that converts the text embedded in any image captured on screen into an editable format, without retyping. This makes the text easy to reuse in digital documents, emails, and reports.

With only this add-on installed in the browser, we can quickly translate the text of an image or video. For this kind of task, we may previously have turned to Google Translate or to another extension that translates the text of a web page. However, Copyfish goes a step further, since it can also handle text that is in a different language from the rest of the page.

Once it is installed, we can quickly translate text shown in an image, or in the subtitles or signs of a video. Its capture reader converts the text of any screen capture into an editable format without retyping, making it easy to reuse.

Translate text and videos from the browser

To start using it, just add the extension to the browser. After that, whenever we want to translate text shown in an image or video, we click the extension’s icon, found to the right of the browser’s address bar.

This will open a tool so that we can select the area where the text that we need to extract or translate appears, either from an image or from a video. Once we have selected the area, a pop-up window appears that shows us the extracted text.

From here we can copy the text and paste it wherever we want by clicking the “Copy to clipboard” button. We can also capture the area again by clicking the “Recapture” button. This utility can help us translate the subtitles of a video we are watching on YouTube, for example. To do this, we click the Google Translate icon, which takes us to that service and shows the translation of the text.

If we click the gear-shaped button, a window appears with the options menu. From here we can change the OCR engine or replace the free plan with one of the paid plans. We can also choose the translation language and the font size of the text box. The keyboard shortcuts are listed at the bottom.

Conclusions

Copyfish is an interesting tool that works better for extracting text from a captured image than as a translator, at least in its free version. It is easy to use, although its interface is in English, and the extraction process is quite fast. We can copy the content directly to the clipboard and paste it wherever we want. We can also translate the text through Google Translate to get its meaning, although the results are sometimes not entirely correct because they are too literal.

Given that it is a free add-on, it can be useful in situations where we regularly need to copy text from images and videos, since it works from a screenshot. The OCR works well in most cases, although it depends to some extent on how legible the letters are.

Copyfish free download

This free OCR software is distributed as a browser add-on and can be downloaded for free for Google Chrome, Mozilla Firefox, and Microsoft Edge:

  • Google Chrome
  • Mozilla Firefox
  • Microsoft Edge

Along with the free OCR option, Copyfish offers Pro OCR plans, which are available as monthly subscriptions. These plans use more processing power, allowing for near-instant conversions with better results than the free standard version, and can even read handwriting.

  • Free plan: offers the standard OCR function with support for 25 languages. The input and output languages must be set manually in order to translate.
  • Pro Edition: in addition to the standard OCR function, it adds automatic language detection, translation improvements, and handwriting support. It supports up to 89 languages. It is priced at $19.95 per month and offers a 7-day trial.
  • Pro+ Edition: has all the features of the Pro Edition and adds automatic translation, so there is no need to select the input and output languages. It costs $29.95 per month and has a 7-day free trial.

Other alternatives

If Copyfish does not meet our needs, there are other OCR tools to consider.

ABBYY FineReader

ABBYY FineReader is an Optical Character Recognition (OCR) program that offers accurate text recognition and document conversion. It is quite intuitive to use and supports more than 190 languages, which makes it one of the options to consider. It handles scanned paper, digital images, PDFs, and Word and Excel files. We can download a free trial version of ABBYY FineReader from its website.

Tesseract

Tesseract is an open-source OCR engine (a JavaScript port, Tesseract.js, is also available) that recognizes words in many languages from images. It can read a binary, grayscale, or color image and output the result as text. It reads uncompressed TIFF images as well as compressed image formats. The program supports Unicode (UTF-8) and can recognize more than 100 languages. It supports various output formats, such as plain text, hOCR, PDF, text-only PDF, and TSV. It can be downloaded from its page on GitHub.
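
As a rough illustration of the language and output-format options, the sketch below builds a command line for the `tesseract` command-line program and runs it from Node.js. It assumes Tesseract is installed and on the PATH with its standard config files (`txt`, `hocr`, `pdf`, `tsv`). The helper names `buildTesseractArgs` and `runTesseract` are hypothetical.

```typescript
// Output formats selected via Tesseract's standard config files.
type OutputFormat = "txt" | "hocr" | "pdf" | "tsv";

// Build the argument list: input image, output base name, language, format.
// Using "stdout" as the output base makes Tesseract print the result
// instead of writing a file.
function buildTesseractArgs(
  imagePath: string,
  outputBase: string,
  lang = "eng",
  format: OutputFormat = "txt"
): string[] {
  return [imagePath, outputBase, "-l", lang, format];
}

// Run Tesseract and resolve with the recognized plain text.
// Assumes a Node.js runtime and a `tesseract` binary on the PATH.
async function runTesseract(imagePath: string, lang = "eng"): Promise<string> {
  const { execFile } = await import("node:child_process");
  return new Promise<string>((resolve, reject) => {
    execFile(
      "tesseract",
      buildTesseractArgs(imagePath, "stdout", lang),
      (error, stdout) => (error ? reject(error) : resolve(String(stdout)))
    );
  });
}
```

Calling `runTesseract("scan.png", "deu")` would attempt to recognize German text in a file named `scan.png`. The language code must match a language pack that is installed locally.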