GitHub - RyannKim327/image-to-text: A simple Tesseract wherein you can have the text from the image. Idea from: https://www.npmjs.com/package/text-from-image

Image to Text (Text Extractor)

MPOP Reverse II

How to install

npm install xtract-txt

Introduction

The OCR (Optical Character Recognition) Engine Mode has been part of this project since its first release. According to IBM, optical character recognition (OCR) is sometimes referred to as text recognition. An OCR program extracts and repurposes data from scanned documents, camera images, and image-only PDFs. OCR software singles out letters in the image, puts them into words, and then puts the words into sentences, enabling access to and editing of the original content. It also eliminates the need for manual data entry.

According to David Sixela, the page segmentation mode (PSM) defines how Tesseract should treat your text. For example, if your image contains a single character or a block of text, specify the corresponding PSM to improve accuracy.
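To make the page segmentation idea concrete, here is a minimal sketch that maps a description of the image content to one of Tesseract's own PSM numbers. The numbers are Tesseract's standard values. The helper `pickPsm` and the key names are hypothetical, and how this package's `PSM` type relates to these numbers is not documented in this README, so treat the mapping as an illustration only.

```javascript
// A subset of Tesseract's page segmentation mode numbers.
// The key names are made up for this sketch.
const TESSERACT_PSM = {
	auto: 3, // fully automatic page segmentation (Tesseract's default)
	singleBlock: 6, // a single uniform block of text
	singleLine: 7, // a single text line
	singleWord: 8, // a single word
	singleChar: 10, // a single character
}

// Hypothetical helper: fall back to automatic segmentation
// when the kind of content is not recognized.
function pickPsm(kind) {
	return TESSERACT_PSM[kind] ?? TESSERACT_PSM.auto
}

console.log(pickPsm("singleChar")) // 10
console.log(pickPsm("receipt")) // 3
```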

How to use (extractText) .extractText(image_path: string, options?: XTRACT_OPTIONS): Promise< XTRACT >

const extractText = require("xtract-txt")

let run = async () => {
	let output = await extractText("./sampleimg.png")
	// await scan("./sampleimg.png", 2, 3)
	// This is just optional
	console.log(output)
}

run()

Result

{
	"result": "Sample text"
}

Here are the keys accepted by the options?: XTRACT_OPTIONS parameter. All keys are optional.

Key	Datatype
debugging	boolean
grayscale	boolean
contrast	number
normalized	boolean
posterized	number
ocr_engine_mode	OEM
pageseg_mode	PSM
save_image_path	file_path.extension
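As a rough sketch of how an options object could be checked against the table above before passing it on, the snippet below compares each key with its listed datatype. The helper `checkOptions` is hypothetical and not part of the package. The `OEM`, `PSM`, and file path keys are skipped because their exact shapes are not documented here, and the option values are illustrative only.

```javascript
// Expected primitive type for each key, taken from the table above.
const OPTION_TYPES = {
	debugging: "boolean",
	grayscale: "boolean",
	contrast: "number",
	normalized: "boolean",
	posterized: "number",
}

// Keys whose shapes are not documented in this README, so they are not checked.
const UNCHECKED_KEYS = ["ocr_engine_mode", "pageseg_mode", "save_image_path"]

// Hypothetical helper: returns a list of problems found in an options object.
function checkOptions(options = {}) {
	const problems = []
	for (const [key, value] of Object.entries(options)) {
		if (UNCHECKED_KEYS.includes(key)) continue
		const expected = OPTION_TYPES[key]
		if (expected === undefined) {
			problems.push(`unknown key: ${key}`)
		} else if (typeof value !== expected) {
			problems.push(`${key} should be a ${expected}`)
		}
	}
	return problems
}

const options = { grayscale: true, contrast: 0.5, debugging: false }
console.log(checkOptions(options)) // []
```

An object that passes the check would then go in as the second argument, as in `extractText("./sampleimg.png", options)`.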

How to use (Add language) .addLanguage(language: string)

This feature is optional. The package already ships with two default languages: English and orientation and script detection (osd).

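const extractText = require("xtract-txt")
// Assumes the named exports are exposed as properties of the exported function
const { addLanguage, CEBUANO, FILIPINO, TAGALOG } = extractText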

let run = async () => {
	addLanguage(CEBUANO)
	addLanguage(FILIPINO)
	addLanguage(TAGALOG)
	let output = await extractText("./sampleimg.png")
	console.log(output)
}

run()

Add language is still in development, so this feature might not be stable. Wrap calls in try/catch to handle errors and avoid crashing your application.
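The try/catch advice can be sketched as a small wrapper. `tryAddLanguage` is a hypothetical helper, not part of the package. It takes the `addLanguage` function as an argument and awaits it, so it covers both a synchronous throw and a rejected promise, since this README does not say which one applies. The `fakeAddLanguage` stand-in and its language strings exist only so the sketch runs without the package installed.

```javascript
// Hypothetical helper: calls the given addLanguage function and reports
// failure with a warning instead of letting the error propagate.
async function tryAddLanguage(addLanguage, language) {
	try {
		await addLanguage(language)
		return true
	} catch (error) {
		console.warn(`Could not add language "${language}": ${error.message}`)
		return false
	}
}

// Stand-in for the package's addLanguage, used only for this sketch.
const fakeAddLanguage = (language) => {
	if (language !== "ok-lang") throw new Error("language data unavailable")
}

const run = async () => {
	console.log(await tryAddLanguage(fakeAddLanguage, "ok-lang")) // true
	console.log(await tryAddLanguage(fakeAddLanguage, "bad-lang")) // false, after a warning
}

run()
```

With the real package, the call would look like `await tryAddLanguage(addLanguage, CEBUANO)` before calling `extractText`.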

Language Lists

ARABIC
CEBUANO
CHINESE_SIMPLIFIED
CHINESE_TRADITIONAL
GERMAN
GREEK
FILIPINO
HEBREW
JAPANESE
KOREAN
TAGALOG

For more languages, kindly visit this link and use the language key to add one.

Credits

Tesseract.js
cli-progress
ansi-colors
Jimp
