FBNetV3
An architecture generated using NARS.
FBNetV3 is a family of state-of-the-art compact neural networks generated through Neural Architecture-Recipe Search (NARS). NARS extends Neural Architecture Search (NAS) by searching jointly for the architecture and the training recipe. FBNetV3 has been shown to improve mAP (mean Average Precision) in object detection.
Neural Architecture Search is the technique of automating the design of artificial neural networks.

Parameters

Backbone Network

Many image segmentation and object detection tasks rely on feature extraction and region proposals, as this approach has proven to be more cost-effective. FBNetV3 therefore starts with a backbone network that extracts these features.

Pooler Resolution (Mask)

The size to which proposals are pooled before they are fed to the mask predictor. In Model Playground, the default value is 14.

Pooler Resolution (Box)

The size to which proposals are pooled before they are fed to the box predictor. In Model Playground, the default value is 6.
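
As a rough illustration of what the pooler resolution controls, the sketch below pools regions of different sizes onto a fixed grid. It is a simplified stand-in (integer bins with max pooling) and not the pooler the model actually uses; the name `roi_max_pool` and the toy feature map are assumptions made for this example.

```python
def roi_max_pool(feature, box, resolution):
    """Pool a region of a 2-D feature map onto a resolution x resolution grid.

    feature: list of rows (H x W).
    box: (x0, y0, x1, y1) in feature-map cells, with x1 and y1 exclusive.
    Hypothetical helper: integer bins and max pooling only.
    """
    x0, y0, x1, y1 = box
    width, height = x1 - x0, y1 - y0
    pooled = []
    for i in range(resolution):
        # Rows covered by output bin i (always at least one row).
        r_start = y0 + (i * height) // resolution
        r_end = max(r_start + 1,
                    y0 + ((i + 1) * height + resolution - 1) // resolution)
        row = []
        for j in range(resolution):
            # Columns covered by output bin j (always at least one column).
            c_start = x0 + (j * width) // resolution
            c_end = max(c_start + 1,
                        x0 + ((j + 1) * width + resolution - 1) // resolution)
            row.append(max(feature[r][c]
                           for r in range(r_start, r_end)
                           for c in range(c_start, c_end)))
        pooled.append(row)
    return pooled


# Toy 8 x 8 feature map whose values increase row by row.
feature = [[r * 8 + c for c in range(8)] for r in range(8)]
print(roi_max_pool(feature, (0, 0, 8, 8), 2))  # [[27, 31], [59, 63]]
```

Whatever the size of the proposal, the output grid has the requested resolution, so a resolution of 14 gives the mask predictor a 14 x 14 input per proposal and a resolution of 6 gives the box predictor a 6 x 6 input.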

Weights

Before training, the weights of the neural network have to be initialized to certain values. Users initialize the weights from FBNetV3a-DSMask-C4 COCO.

IOU Threshold

The IoU (Intersection over Union) threshold is used to decide whether a bounding box contains an object or background.
Boxes with an IoU above the upper bound are classified as objects, and boxes with an IoU below the lower bound are classified as background. Boxes with values between the lower and the upper bound are ignored.
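
The labeling rule can be sketched in a few lines of plain Python. The function names and the bounds of 0.3 and 0.7 are illustrative assumptions, not values taken from Model Playground.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix1, iy1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_proposal(proposal, gt_boxes, lower=0.3, upper=0.7):
    """Return 1 (object), 0 (background), or -1 (ignored).

    lower and upper are assumed example bounds.
    """
    # Best overlap of this proposal with any ground-truth box.
    best = max((iou(proposal, gt) for gt in gt_boxes), default=0.0)
    if best > upper:
        return 1
    if best < lower:
        return 0
    return -1  # between the two bounds: not used for training


gt = [(0, 0, 10, 10)]
print(label_proposal((0, 0, 10, 10), gt))    # 1
print(label_proposal((20, 20, 30, 30), gt))  # 0
print(label_proposal((0, 0, 10, 5), gt))     # -1 (IoU is 0.5)
```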

Normalization method

Normalization techniques help to decrease the overall training time of the model. They make the contribution of the features uniform by normalizing the activations. This also helps to keep the weights from exploding and hence makes optimization faster.
There are three normalization methods available in Model Playground:
    GN
    SyncBN
    naiveSyncBN

SyncBN

In this normalization technique, the inputs are normalized by the mean and the variance and then scaled and shifted. Mathematically, it is given as
$$\hat{x} = \frac{x - E(x)}{\sqrt{Var(x) + \epsilon}}$$
$$y = \gamma \cdot \hat{x} + \beta$$
The mean and standard deviation are calculated per dimension over all mini-batches of the same process group. The scaling and shifting then happens with two further constants, $\gamma$ and $\beta$. These are parameters that are usually learned by the network.
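
A minimal sketch of the computation for a single channel, assuming each worker holds a plain list of activations. The function name is hypothetical, and the plain sums stand in for the all-reduce a real distributed setup would perform.

```python
import math


def sync_batch_norm(worker_batches, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one channel with statistics pooled across all workers.

    worker_batches: one list of activations per worker (assumed layout).
    """
    # Every worker contributes its count, sum, and sum of squares.
    count = sum(len(b) for b in worker_batches)
    total = sum(sum(b) for b in worker_batches)
    total_sq = sum(sum(x * x for x in b) for b in worker_batches)

    mean = total / count
    # Biased variance, E[x^2] - E[x]^2, clamped against rounding error.
    var = max(total_sq / count - mean * mean, 0.0)

    scale = gamma / math.sqrt(var + eps)
    return [[scale * (x - mean) + beta for x in b] for b in worker_batches]


normalized = sync_batch_norm([[1.0, 2.0], [3.0, 4.0, 5.0, 6.0]])
```

Because the statistics are pooled before normalizing, the result is the same as if all activations had been processed on a single device.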

NaiveSyncBN

In this normalization technique, equal weight is assigned to the statistics of every worker, regardless of the dimensions of its images. This removes the need to compute the exact mean and variance over all samples. Little difference has been observed between this simplified calculation and the exact mean and variance calculation.
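
The difference between equal weighting and an exact computation can be sketched as follows, assuming a simplified view in which each worker reports the mean and the mean of squares of its own activations. All function names are hypothetical.

```python
def worker_moments(batch):
    """Mean and mean of squares of one worker's activations."""
    n = len(batch)
    return sum(batch) / n, sum(x * x for x in batch) / n


def naive_sync_stats(worker_batches):
    """Average the per-worker moments with equal weight per worker."""
    moments = [worker_moments(b) for b in worker_batches]
    mean = sum(m for m, _ in moments) / len(moments)
    mean_sq = sum(s for _, s in moments) / len(moments)
    return mean, mean_sq - mean * mean


def exact_stats(worker_batches):
    """Mean and variance over all activations pooled together."""
    pooled = [x for b in worker_batches for x in b]
    mean, mean_sq = worker_moments(pooled)
    return mean, mean_sq - mean * mean


same_size = [[1.0, 2.0], [3.0, 4.0]]
mixed_size = [[1.0, 2.0], [3.0, 4.0, 5.0, 6.0]]
print(naive_sync_stats(same_size), exact_stats(same_size))
print(naive_sync_stats(mixed_size), exact_stats(mixed_size))
```

When every worker holds the same number of values the two results agree. They drift apart when the workers hold different amounts of data, which is the approximation this method accepts.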

GN

Group Normalization, abbreviated as GN, is another normalization technique that normalizes channels in groups. If the input has 50 channels, then GN can split those 50 channels into 5 groups and normalize each group with its own mean and variance.
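
The grouping can be sketched in plain Python, assuming the 50 values are split into 5 groups of 10 and that each value stands for one channel. The function name is hypothetical, and the learnable scale and shift are left out for brevity.

```python
import math


def group_norm(channels, num_groups, eps=1e-5):
    """Normalize a flat list of per-channel values group by group."""
    if len(channels) % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    size = len(channels) // num_groups
    out = []
    for g in range(num_groups):
        group = channels[g * size:(g + 1) * size]
        # Each group uses its own mean and variance.
        mean = sum(group) / size
        var = sum((x - mean) ** 2 for x in group) / size
        out.extend((x - mean) / math.sqrt(var + eps) for x in group)
    return out


channels = [float(i) for i in range(50)]
normalized = group_norm(channels, num_groups=5)
```

Unlike the batch-based methods above, the statistics here do not depend on the batch size, since they are computed within each sample.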

Pre NMS number of proposals

The maximum number of proposals to be considered before non-maximum suppression (NMS). The proposals are sorted by confidence in descending order, and only the ones with the highest confidence are kept.

Post NMS number of proposals

The maximum number of proposals to be kept after non-maximum suppression. A higher number increases the probability of detecting more objects, but it also increases the computation cost, since more region proposals have to be processed.
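
How the two limits interact with NMS can be sketched as below. This is a greedy NMS in plain Python under assumed names (`pre_nms_topk`, `post_nms_topk`) and an assumed overlap threshold, not the implementation used by the model.

```python
def box_iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_with_topk(boxes, scores, iou_threshold=0.5,
                  pre_nms_topk=1000, post_nms_topk=100):
    """Return indices of kept proposals, highest confidence first."""
    # Pre-NMS limit: sort by confidence and keep only the top candidates.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    order = order[:pre_nms_topk]

    keep = []
    for i in order:
        # Post-NMS limit: stop once enough proposals survived.
        if len(keep) >= post_nms_topk:
            break
        # Keep a box only if it does not overlap a kept box too strongly.
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (40, 40, 50, 50)]
scores = [0.9, 0.8, 0.7, 0.6]
print(nms_with_topk(boxes, scores))                   # [0, 2, 3]
print(nms_with_topk(boxes, scores, post_nms_topk=2))  # [0, 2]
print(nms_with_topk(boxes, scores, pre_nms_topk=2))   # [0]
```

The last call shows the cost of a low pre-NMS limit: the two distinct boxes with lower confidence are never considered at all.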

Code Implementation

import urllib.request

import torch
from mobile_cv.model_zoo.models.fbnet_v2 import fbnet
from mobile_cv.model_zoo.models.preprocess import get_preprocess
from PIL import Image


def _get_input():
    # Download an example image from the pytorch website
    url, filename = (
        "https://github.com/pytorch/hub/raw/master/dog.jpg",
        "dog.jpg",
    )
    local_filename, headers = urllib.request.urlretrieve(url, filename)
    input_image = Image.open(local_filename)
    return input_image


def run_fbnet_v2():
    # fbnet models, supported models could be found in
    # mobile_cv/model_zoo/models/model_info/fbnet_v2/*.json
    model_name = "dmasking_l3"

    # load model
    model = fbnet(model_name, pretrained=True)
    model.eval()
    preprocess = get_preprocess(model.arch_def.get("input_size", 224))

    # load and process input
    input_image = _get_input()
    input_tensor = preprocess(input_image)
    # add a batch dimension: (C, H, W) -> (1, C, H, W)
    input_batch = input_tensor.unsqueeze(0)

    # run model without tracking gradients
    with torch.no_grad():
        output = model(input_batch)
    # convert the logits of the first image to class probabilities
    output_softmax = torch.nn.functional.softmax(output[0], dim=0)
    # print the highest probability and the index of its class
    print(output_softmax.max(0))


if __name__ == "__main__":
    run_fbnet_v2()