Table Detection Using Deep Learning, Object Detection, Faster R-CNN

Tabulo

Tabulo is an open-source toolkit for computer vision. It currently supports table detection, with more features planned. It is built in Python, using Luminoth, TensorFlow, and Sonnet.

Table of Contents

  1. Installation Instructions
  2. Available APIs
  3. Running Tabulo
  4. Running Tabulo As Service
  5. Supported models
  6. Usage
  7. Working with pretrained Models
  8. Working with datasets
  9. Training
  10. LICENSE

1. Installation Instructions

Tabulo currently supports Python 2.7 and 3.4–3.6.
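A quick way to confirm that an interpreter falls within this range is sketched below. The helper name is_supported_python is hypothetical and is not part of Tabulo.

```python
import sys


def is_supported_python(version=None):
    """Return True for Python 2.7 or 3.4 through 3.6, the range stated above.

    `version` is a (major, minor, ...) tuple; defaults to the running interpreter.
    """
    major, minor = tuple(version or sys.version_info)[:2]
    return (major, minor) == (2, 7) or (major == 3 and 4 <= minor <= 6)


print("Supported interpreter:", is_supported_python())
```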

1.1 Pre-requisites

TensorFlow must be installed before Tabulo. For GPU support, install the GPU version with pip install tensorflow-gpu; otherwise, install the CPU version with pip install tensorflow.
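To check whether TensorFlow is already present before installing Tabulo, something like the following may help. It is a minimal sketch that assumes Python 3.4 or later (importlib.util.find_spec is not available on 2.7); the helper names are illustrative only. Both pip packages mentioned above provide a module named tensorflow.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


def tensorflow_available():
    """Return True if the `tensorflow` module is importable in this environment."""
    return module_available("tensorflow")


print("TensorFlow installed:", tensorflow_available())
```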

1.2 Installing Tabulo

First, clone the repo on your machine and then install with pip:

git clone https://github.com/interviewBubble/Tabulo.git
cd tabulo
pip install -e .

1.3 Check that the installation worked

Simply run tabulo --help.
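The same check can be scripted, for example in a setup or CI step. The sketch below assumes that tabulo --help exits with status 0 when the installation is healthy, which is typical for command-line tools but is not stated in this README. The function name installation_ok is hypothetical.

```python
import shutil
import subprocess


def installation_ok(executable="tabulo"):
    """Return True if `<executable> --help` is found on PATH and exits with status 0."""
    # shutil.which requires Python 3.3+; it returns None when nothing is found.
    if shutil.which(executable) is None:
        return False
    status = subprocess.call(
        [executable, "--help"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.STDOUT,
    )
    return status == 0


print("tabulo installed:", installation_ok())
```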

2. Available APIs

  • localhost:5000/api/fasterrcnn/predict/ – Detect tables in an image
  • localhost:5000/api/fasterrcnn/extract/ – Extract the content of detected tables

3. Running Tabulo

3.1 Running Tabulo as Web Server:

[Image: Tabulo running as a web server]

3.2 Example of Table Detection with Faster R-CNN By Tabulo:

[Image: example of table detection with Faster R-CNN]

3.3 Example of Table Data Extraction with tesseract By Tabulo:

[Image: example of table data extraction with tesseract]

4. Running Tabulo As Service

4.1 Using the curl command

curl -X POST http://localhost:5000/api/fasterrcnn/predict/ -F image=@/path/to/image/page_8-min.jpg
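The same request can be sent from Python. The sketch below uses only the standard library (Python 3) and builds the multipart/form-data body by hand, with the file sent in a form field named image as in the curl command. The helper names build_multipart and predict are hypothetical, and because this README does not describe the response format, the raw response bytes are returned unparsed.

```python
import urllib.request
import uuid

PREDICT_URL = "http://localhost:5000/api/fasterrcnn/predict/"


def build_multipart(field, filename, data, content_type="image/jpeg"):
    """Encode one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        "--{b}\r\n"
        'Content-Disposition: form-data; name="{f}"; filename="{n}"\r\n'
        "Content-Type: {c}\r\n\r\n"
    ).format(b=boundary, f=field, n=filename, c=content_type).encode("utf-8")
    tail = "\r\n--{b}--\r\n".format(b=boundary).encode("utf-8")
    return head + data + tail, "multipart/form-data; boundary=" + boundary


def predict(image_path, url=PREDICT_URL):
    """POST an image to a running Tabulo server and return the raw response bytes."""
    with open(image_path, "rb") as fh:
        body, content_type = build_multipart("image", "page.jpg", fh.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


# Example (requires the server to be running):
# print(predict("/path/to/image/page_8-min.jpg"))
```

Note that the boundary is generated per request and placed in the Content-Type header, so it always matches the one used in the body.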

4.2 With Postman

[Image: table detection using Postman]

5. Supported models

Currently, we support the following models:

We also provide pre-trained checkpoints for the above models trained on popular datasets such as COCO and Pascal.

6. Usage

There is one main command-line interface, available through the tabulo command. Whenever you are unsure how to do something, just type:

tabulo --help or tabulo <subcommand> --help

and a list of available options with descriptions will show up.

7. Working with pretrained Models

  • Download the pretrained model from Google Drive.
  • Unzip it and copy the downloaded luminoth folder into the luminoth/utils/pretrained_models folder.
  • Run this command to list all checkpoints: tabulo checkpoint list
  • The command prints the available checkpoints, including the ID used in the next step.
  • Now start the server with this command: tabulo server web --checkpoint 6aac7a1e8a8e
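The last two steps can also be driven from a script. The sketch below only assembles the commands given in the list and hands them to subprocess; the function names are hypothetical, and the subcommand is spelled server here on the assumption that this is the intended name. The checkpoint ID is the example value from the list and will differ on your machine.

```python
import subprocess


def checkpoint_list_command():
    """Argument list for listing the installed checkpoints."""
    return ["tabulo", "checkpoint", "list"]


def server_command(checkpoint_id):
    """Argument list for starting the web server with a given checkpoint.

    Assumption: the subcommand is named `server`.
    """
    return ["tabulo", "server", "web", "--checkpoint", checkpoint_id]


def run(command):
    """Run a command in the foreground and return its exit status."""
    return subprocess.call(command)


# Example (requires Tabulo to be installed):
# run(checkpoint_list_command())
# run(server_command("6aac7a1e8a8e"))
```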

8. Working with datasets

A dataset is available for training your custom model.

9. Training

See Training your own model to learn how to train locally or in Google Cloud.

10. LICENSE

Released under the BSD 3-Clause license.

