Table Detection using Deep Learning

Step 1: Preprocessing Tables

We use Luminoth (built on TensorFlow) as the backend. Luminoth only works on preprocessed (grayscale) images.

Use the script below to convert the images to grayscale and apply distance transforms to them:

preprocess.py

import os
import cv2
import pandas as pd

root_dir = os.getcwd()
file_list = ['train.csv', 'val.csv']
image_source_dir = os.path.join(root_dir, 'data/images/')
data_root = os.path.join(root_dir, 'data')

for file in file_list:
    image_target_dir = os.path.join(data_root, file.split(".")[0])
    if not os.path.exists(image_target_dir):
        os.mkdir(image_target_dir)

    # read list of image files to process from file
    image_list = pd.read_csv(os.path.join(data_root, file), header=None)[0]

    print("Start preprocessing images")
    for image in image_list:
        # open image file
        img = cv2.imread(os.path.join(image_source_dir, image))
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

        # perform transformations on image
        b = cv2.distanceTransform(img, distanceType=cv2.DIST_L2, maskSize=5)
        g = cv2.distanceTransform(img, distanceType=cv2.DIST_L1, maskSize=5)
        r = cv2.distanceTransform(img, distanceType=cv2.DIST_C, maskSize=5)

        # merge the transformed channels back to an image
        transformed_image = cv2.merge((b, g, r))
        target_file = os.path.join(image_target_dir, image)
        print("Writing target file {}".format(target_file))
        cv2.imwrite(target_file, transformed_image)

print("Finished preprocessing images")
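
To get a feel for what each channel stores, the following pure-Python sketch computes a city-block (L1) distance transform on a tiny grid: every non-zero cell is replaced by its distance to the nearest zero cell. The helper name l1_distance_transform is hypothetical and the two-pass approach is only an illustration; it is not how OpenCV implements cv2.distanceTransform, and the preprocessing script should keep using the OpenCV call.

```python
def l1_distance_transform(grid):
    """Return the city-block distance from each cell to the nearest zero cell.

    `grid` is a list of equally long rows; zero cells are the targets.
    Illustration only (hypothetical helper, not the OpenCV implementation).
    """
    rows, cols = len(grid), len(grid[0])
    far = rows + cols  # larger than any L1 distance that can occur in the grid
    dist = [[0 if grid[r][c] == 0 else far for c in range(cols)]
            for r in range(rows)]

    # Forward pass: propagate distances from the top and left neighbours.
    for r in range(rows):
        for c in range(cols):
            if r > 0:
                dist[r][c] = min(dist[r][c], dist[r - 1][c] + 1)
            if c > 0:
                dist[r][c] = min(dist[r][c], dist[r][c - 1] + 1)

    # Backward pass: propagate distances from the bottom and right neighbours.
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            if r < rows - 1:
                dist[r][c] = min(dist[r][c], dist[r + 1][c] + 1)
            if c < cols - 1:
                dist[r][c] = min(dist[r][c], dist[r][c + 1] + 1)
    return dist


# A single zero cell in the middle of a 3x3 grid of non-zero cells.
example = l1_distance_transform([[255, 255, 255],
                                 [255, 0, 255],
                                 [255, 255, 255]])
```

In the script, the three calls differ only in the distance metric (DIST_L2, DIST_L1, DIST_C), so the merged image carries three slightly different distance maps in its channels.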

Step 2: Create TFRecords

To train our model, we need to convert the preprocessed images into TFRecords.

Run this command to create the TFRecords:

lumi dataset transform --type csv --data-dir data/ --output-dir tfdata/ --split train --split val --only-classes=table
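
Before running the conversion, it can help to sanity-check the annotation files. The sketch below assumes each row of train.csv and val.csv has the form file name, xmin, ymin, xmax, ymax, label, with no header row. That column layout is an assumption: the preprocessing script only shows that column 0 holds the image file name. The function name check_annotations and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def check_annotations(csv_text):
    """Return (problems, label_counts) for annotation rows.

    Assumed row layout (not confirmed by the source):
    image file name, xmin, ymin, xmax, ymax, label.
    """
    problems = []
    label_counts = Counter()
    reader = csv.reader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 6:
            problems.append((line_no, "expected 6 fields, got %d" % len(row)))
            continue
        _image_id, xmin, ymin, xmax, ymax, label = row
        try:
            xmin, ymin, xmax, ymax = [float(v) for v in (xmin, ymin, xmax, ymax)]
        except ValueError:
            problems.append((line_no, "non-numeric coordinate"))
            continue
        if xmin >= xmax or ymin >= ymax:
            problems.append((line_no, "empty or inverted box"))
            continue
        label_counts[label] += 1
    return problems, label_counts


# Made-up rows for illustration; real files would be read from data/train.csv.
sample_rows = (
    "a.png,10,20,110,220,table\n"
    "b.png,50,60,40,80,table\n"
    "c.png,1,2,3\n"
    "d.png,1,2,x,4,table\n"
    "e.png,0,0,5,5,figure\n"
)
problems, label_counts = check_annotations(sample_rows)
```

The label counts show how many boxes carry the label table, which is the only class kept by --only-classes=table.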

Step 3: Training our Model

Start training the Luminoth network by running this script:

start_traning.sh

lumi train -c config.yml

config.yml

train:
  # Name used to identify the run. Data inside `job_dir` will be stored under
  # `run_name`.
  run_name: table-area-detection-0.1
  # Base directory in which model checkpoints & summaries (for Tensorboard) will
  # be saved.
  job_dir: jobs/
  save_checkpoint_secs: 10
  save_summaries_secs: 10
  # Number of epochs (complete dataset batches) to run.
  num_epochs: 10
dataset:
  type: object_detection
  # From which directory to read the dataset.
  dir: tfdata/classes-table/
  image_preprocessing:
    min_size: 600
    max_size: 1024
  data_augmentation:
    - flip:
        left_right: True
        up_down: True
        prob: 0.5
model:
  type: fasterrcnn
  network:
    # Total number of classes to predict.
    num_classes: 1

Facts

Training can take more than one day (24 hours).

The last lines of the output will look like this:

INFO:tensorflow:Saving checkpoints for 3371 into jobs/table-area-detection-0.1/model.ckpt.
INFO:tensorflow:step: 3371, file: b'9527_018.png', train_loss: 2.469784736633301, in 23.64s
INFO:tensorflow:Saving checkpoints for 3372 into jobs/table-area-detection-0.1/model.ckpt.
INFO:tensorflow:step: 3372, file: b'9526_017.png', train_loss: 2.6023592948913574, in 21.86s
INFO:tensorflow:Saving checkpoints for 3373 into jobs/table-area-detection-0.1/model.ckpt.
INFO:tensorflow:step: 3373, file: b'1742_157.png', train_loss: 2.5478856563568115, in 22.36s
INFO:tensorflow:Saving checkpoints for 3374 into jobs/table-area-detection-0.1/model.ckpt.
INFO:tensorflow:step: 3374, file: b'9528_061.png', train_loss: 3.118919849395752, in 21.22s
INFO:tensorflow:Saving checkpoints for 3375 into jobs/table-area-detection-0.1/model.ckpt.
INFO:tensorflow:step: 3375, file: b'9516_001.png', train_loss: 3.0582146644592285, in 21.71s
INFO:tensorflow:Saving checkpoints for 3376 into jobs/table-area-detection-0.1/model.ckpt.
INFO:tensorflow:step: 3376, file: b'9509_040.png', train_loss: 2.7756423950195312, in 22.40s
INFO:tensorflow:Saving checkpoints for 3377 into jobs/table-area-detection-0.1/model.ckpt.
INFO:tensorflow:step: 3377, file: b'9508_065.png', train_loss: 3.152759552001953, in 22.57s
INFO:tensorflow:Saving checkpoints for 3378 into jobs/table-area-detection-0.1/model.ckpt.
INFO:tensorflow:step: 3378, file: b'5008_029.png', train_loss: 2.618196725845337, in 21.45s
INFO:tensorflow:Saving checkpoints for 3379 into jobs/table-area-detection-0.1/model.ckpt.
INFO:tensorflow:step: 3379, file: b'9518_018.png', train_loss: 2.6546759605407715, in 21.48s
INFO:tensorflow:Saving checkpoints for 3380 into jobs/table-area-detection-0.1/model.ckpt.
INFO:tensorflow:step: 3380, file: b'9515_024.png', train_loss: 3.292630434036255, in 21.28s
INFO:tensorflow:finished training after 10 epoch limit

To reduce the training loss further, you can train for 20 epochs instead by raising num_epochs in config.yml.
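
To judge whether more epochs are worthwhile, it helps to look at the loss over many steps rather than a single log line. The sketch below parses log lines of the form shown above and averages train_loss. The function names and the regular expression are illustrative assumptions based only on that log excerpt.

```python
import re

# Matches lines such as:
# INFO:tensorflow:step: 3380, file: b'9515_024.png', train_loss: 3.29, in 21.28s
STEP_RE = re.compile(
    r"step: (\d+), file: b'([^']+)', train_loss: ([^,\s]+), in ([0-9.]+)s"
)


def parse_training_log(lines):
    """Return a list of (step, file_name, train_loss, seconds) tuples."""
    records = []
    for line in lines:
        match = STEP_RE.search(line)
        if match:
            step, name, loss, secs = match.groups()
            records.append((int(step), name, float(loss), float(secs)))
    return records


def mean_loss(records, last_n=None):
    """Average train_loss over all records, or only over the last `last_n`."""
    selected = records[-last_n:] if last_n else records
    if not selected:
        return None
    return sum(record[2] for record in selected) / len(selected)


# Two step lines copied from the log excerpt, plus one line that is ignored.
SAMPLE_LOG = """\
INFO:tensorflow:Saving checkpoints for 3379 into jobs/table-area-detection-0.1/model.ckpt.
INFO:tensorflow:step: 3379, file: b'9518_018.png', train_loss: 2.6546759605407715, in 21.48s
INFO:tensorflow:step: 3380, file: b'9515_024.png', train_loss: 3.292630434036255, in 21.28s
"""
records = parse_training_log(SAMPLE_LOG.splitlines())
average = mean_loss(records)
```

Comparing mean_loss over an early window of steps with mean_loss(records, last_n=...) over the most recent steps gives a rough indication of whether the loss is still falling.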

Step 4: Store the Last Checkpoint

You must store the last checkpoint; it will be used in the prediction step.

create_checkpoint.sh

lumi checkpoint create config.yml

Step 5: Prediction

You can use Luminoth in three ways:

  1. Command line
  2. Luminoth web server
  3. Using Python

1. Command line:

Run this command in a terminal, replacing c2155084dca6 with the ID of your own checkpoint:

lumi predict --checkpoint c2155084dca6 data/val/9541_023.png
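
The predictions usually need some post-processing, for example dropping low-confidence boxes. The sketch below assumes that the prediction for one image is available as a JSON object with a "file" entry and an "objects" list whose items carry "bbox", "label" and "prob". That shape is an assumption, so check it against the actual output of your Luminoth version. The function name confident_boxes and the sample values are made up for illustration and are not real model output.

```python
import json


def confident_boxes(prediction_json, label="table", min_prob=0.5):
    """Return the bounding boxes of `label` with probability >= `min_prob`.

    Assumes (not confirmed by the source) a JSON object of the form
    {"file": ..., "objects": [{"bbox": [...], "label": ..., "prob": ...}]}.
    """
    result = json.loads(prediction_json)
    boxes = []
    for obj in result.get("objects", []):
        if obj.get("label") == label and obj.get("prob", 0.0) >= min_prob:
            boxes.append(tuple(obj["bbox"]))
    return boxes


# Hypothetical prediction used only to exercise the function.
sample_prediction = json.dumps({
    "file": "data/val/9541_023.png",
    "objects": [
        {"bbox": [10, 20, 300, 400], "label": "table", "prob": 0.98},
        {"bbox": [5, 5, 50, 50], "label": "table", "prob": 0.2},
    ],
})
tables = confident_boxes(sample_prediction, min_prob=0.5)
```

The 0.5 threshold is arbitrary; a suitable value depends on how the trained model behaves on the validation images.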

2. Luminoth web server:

Run this command to start the Luminoth web server:

lumi server web --checkpoint c2155084dca6

3. Using Python:

See the Luminoth tutorial on using Luminoth from Python:

https://luminoth.readthedocs.io/en/latest/tutorial/07-using-luminoth-from-python.html