Implement semantic segmentation with npm

dannadori
4 min read · Jun 23, 2020


The other day, I created an npm package that performs semantic segmentation on large images by splitting them into tiles. In this post, I'll explain how to use it by building a demo.

This package is designed for semantic segmentation and lets you adjust the trade-off between accuracy and response time on performance-limited devices such as smartphones.
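To make the splitting idea concrete, the sketch below computes overlapping tile rectangles for a grid. This is a conceptual illustration only, not the package's actual implementation; tileRects and its parameters are hypothetical names.

// Conceptual sketch of tiled inference with overlap; not the package's actual code.
const tileRects = (width: number, height: number, cols: number, rows: number, margin: number) => {
    const rects: { x: number, y: number, w: number, h: number }[] = []
    const tileW = width / cols
    const tileH = height / rows
    const padX = tileW * margin // overlap shared with the neighboring tile
    const padY = tileH * margin
    for (let r = 0; r < rows; r++) {
        for (let c = 0; c < cols; c++) {
            const x = Math.max(0, c * tileW - padX)
            const y = Math.max(0, r * tileH - padY)
            const w = Math.min(width, (c + 1) * tileW + padX) - x
            const h = Math.min(height, (r + 1) * tileH + padY) - y
            rects.push({ x, y, w, h })
        }
    }
    return rects
}

The idea is that each tile can be fed to the model at its expected resolution, and the overlapping margins help hide seams when the per-tile masks are stitched back together.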

For this demonstration, we will create something like this: barcode detection and segmentation.

Here’s a post from the last time I created an npm package.

https://medium.com/@dannadori/how-to-create-an-npm-package-that-includes-a-webworker-c469b209819

Advance preparation

We’ll create a demo with React.
First, let’s set up the environment.

$ npx create-react-app demo --template typescript

Install package

Install the semantic segmentation package, then copy its Web Worker script into the public folder:

$ npm install scalable-semantic-segmentation-js
$ node node_modules/scalable-semantic-segmentation-js/bin/install_worker.js public
file is copied

Prepare Semantic Segmentation model

Next, we need to prepare the semantic segmentation model to be used.
The model is assumed to take a tensor of shape [batch, height, width, channels] as input. Place the converted model under the public folder:

$ ls public/WEB_MODEL/300x300_0.10/
group1-shard1of1.bin model.json
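
If you want to confirm that a converted model has the expected input shape, a quick sanity check with TensorFlow.js looks like the following. This is a minimal sketch: the model path matches the directory above, and the printed shape is only an example.

import * as tf from '@tensorflow/tfjs'

// Load the converted graph model and print its input shapes.
// Expect something like [[-1, 300, 300, 3]] for this model.
const checkModel = async () => {
    const model = await tf.loadGraphModel('/WEB_MODEL/300x300_0.10/model.json')
    console.log(model.inputs.map(i => i.shape))
}
checkModel()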

Source code

When you are ready, write the source code. Here, I will only explain the parts that I think are important; the entire source can be found in the repository linked at the end of this post.

In this demo, we will create an instance of the module’s class as a member variable of the React component.

// The import path is assumed to follow the package name.
import { ScalableSemanticSegmentation } from 'scalable-semantic-segmentation-js'

// Held as a member variable of the React component
scalableSS: ScalableSemanticSegmentation = new ScalableSemanticSegmentation()

Let’s start with componentDidMount.

componentDidMount() {
    console.log('Initializing')

    const initWorkerPromise = this.initWorker() // <-- (1)

    if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) { // <-- (2)
        const webCamPromise = navigator.mediaDevices
            .getUserMedia({
                audio: false,
                video: DisplayConstraintOptions[this.state.videoResolution]
            })
            .then(stream => {
                console.log(this.videoRef)
                this.videoRef.current!.srcObject = stream; // <-- (3)
                return new Promise((resolve, reject) => {
                    this.videoRef.current!.onloadedmetadata = () => {
                        resolve();
                    };
                });
            });

        Promise.all([initWorkerPromise, webCamPromise])
            .then((res) => {
                console.log('Camera and model ready!')
            })
            .catch(error => {
                console.error(error);
            });
    }
}

(1) initializes the instance that performs semantic segmentation; the method is described later.

The video device is obtained in (2), and the resulting stream is set as the source of the HTMLVideoElement in (3).
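
DisplayConstraintOptions above maps the resolution selected in the component state to getUserMedia constraints. A hypothetical definition (the actual keys and values in the demo may differ) could look like this:

// Hypothetical resolution presets; the demo's actual table may differ.
const DisplayConstraintOptions: { [key: string]: MediaTrackConstraints } = {
    'QVGA': { width: 320, height: 240 },
    'VGA': { width: 640, height: 480 },
    'HD': { width: 1280, height: 720 },
}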

Next, let’s look at the contents of “initWorker”.

async initWorker() {
    // SemanticSegmentation
    this.scalableSS.addInitializedListener(() => { // <-- (1-1)
        const props = this.props as any
        this.setState({ initialized: true })
        this.requestScanBarcode() // <-- (1-2)
    })
    this.scalableSS.addMaskPredictedListeners((maskBitmap: ImageBitmap) => { // <-- (2-1)
        // Recapture the next frame
        this.requestScanBarcode() // <-- (2-2)
    })

    this.scalableSS.init(
        AIConfig.SS_MODEL_PATH,
        AIConfig.SPLIT_WIDTH,
        AIConfig.SPLIT_HEIGHT,
        AIConfig.SPLIT_MARGIN) // <-- (3)
    return
}

This method initializes the instance that performs semantic segmentation.

First, (1-1) registers a callback that fires when initialization, such as loading the model used by the instance, is complete.

In that callback, (1-2) calls the function that performs barcode scanning.

(2-1) registers a callback that is called when segmentation is complete. It receives the resulting mask as an ImageBitmap. Inside this callback, (2-2) calls the barcode-scanning function again, which keeps the process looping.

(3) initializes the instance with the model information and the margin to use when splitting the image (the percentage of overlapping area between adjacent tiles). The first argument is the path to the model, the second and third arguments are the width and height expected by the model, and the fourth argument is the margin.
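
For reference, the configuration passed to init could look like the following sketch. The property names follow the AIConfig used above, but the concrete values are assumptions inferred from the model directory name 300x300_0.10.

// Assumed values, inferred from the model directory name; adjust to your model.
export const AIConfig = {
    SS_MODEL_PATH: '/WEB_MODEL/300x300_0.10/model.json', // path to the converted model
    SPLIT_WIDTH: 300, // width each tile is resized to for inference
    SPLIT_HEIGHT: 300, // height each tile is resized to for inference
    SPLIT_MARGIN: 0.1, // 10% overlap between adjacent tiles
}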

Finally, requestScanBarcode.

requestScanBarcode = async () => {
    console.log('requestScanBarcode')
    const video = this.videoRef.current!
    const controller = this.controllerCanvasRef.current!
    controller.width = this.overlayWidth
    controller.height = this.overlayHeight

    const captureCanvas = captureVideoImageToCanvas(video) // <-- (1)
    if (captureCanvas.width === 0) {
        captureCanvas.remove()
        window.requestAnimationFrame(this.requestScanBarcode);
        return
    }
    this.scalableSS.predict(captureCanvas,
        this.state.colnum,
        this.state.rownum) // <-- (2)
    captureCanvas.remove()
}

Here, the image to be segmented is captured and semantic segmentation is performed.

In (1), the image to be segmented is captured from the HTMLVideoElement. In (2), this image is passed to the module instance to execute semantic segmentation; the second and third arguments are the number of columns and rows into which the image is divided.
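
captureVideoImageToCanvas is a small helper in the demo. A minimal sketch of what it might do (the actual implementation in the repository may differ):

// Minimal sketch: copy the current video frame into a fresh canvas.
const captureVideoImageToCanvas = (video: HTMLVideoElement): HTMLCanvasElement => {
    const canvas = document.createElement('canvas')
    // videoWidth is 0 until metadata is loaded, which the caller checks for.
    canvas.width = video.videoWidth
    canvas.height = video.videoHeight
    canvas.getContext('2d')!.drawImage(video, 0, 0, canvas.width, canvas.height)
    return canvas
}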

This completes the segmentation process.
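
Putting the pieces together, usage of the library outside of React boils down to this minimal sketch (API names as used in this post; videoElement stands for an already-playing HTMLVideoElement and is hypothetical):

const ss = new ScalableSemanticSegmentation()
ss.addInitializedListener(() => {
    // Model is loaded: capture a frame and run the first prediction.
    const canvas = captureVideoImageToCanvas(videoElement)
    ss.predict(canvas, 2, 2) // split into 2 columns x 2 rows
})
ss.addMaskPredictedListeners((mask: ImageBitmap) => {
    console.log(`mask: ${mask.width} x ${mask.height}`)
})
ss.init('/WEB_MODEL/300x300_0.10/model.json', 300, 300, 0.1)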

For debugging purposes, you can also display the segmentation result and the grid information.

<Label basic size="tiny" color={this.state.showSS ? "red" : "grey"} onClick={() => {
    const newValue = !this.state.showSS
    this.scalableSS.previewCanvas = newValue ? this.workerSSMaskMonitorCanvasRef.current! : null
    this.setState({ showSS: newValue })
}}>ss</Label>
<Label basic size="tiny" color={this.state.showGrid ? "red" : "grey"} onClick={() => {
    const newValue = !this.state.showGrid
    this.scalableSS.girdDrawCanvas = newValue ? this.controllerCanvasRef.current! : null
    this.setState({ showGrid: newValue })
}}>grid</Label>

The segmentation preview and the grid are drawn to this.scalableSS.previewCanvas and this.scalableSS.girdDrawCanvas, respectively. Set the HTMLCanvasElement to draw to on each of these properties.

Demo

Github repository and npm package

The source code is stored in the following repository.

https://github.com/FLECT-DEV-TEAM/ScalableSemanticSegmentationjs_demo

The npm package page is at the following URL.

https://www.npmjs.com/package/scalable-semantic-segmentation-js

I am very thirsty!!
