
news 2025/7/9 6:43:53


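YOLOS and DETR perform almost exactly the same operations; the main difference is that YOLOS has no convolutional layers.
See the official HF documentation.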

Because an object detection model actually outputs hundreds or even thousands of "boxes", computing the loss is fairly involved. The loss is a bipartite matching loss; see this blog.

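The target is a dictionary made up of class_label and box. Suppose that for one image we have 5 target boxes.
num_detection_tokens is the maximum number of boxes the model can produce for one image.
A brief outline of the loss computation: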

The ViT model takes the preprocessed image as input and outputs the last hidden state, of shape [batchsize, seq_len, hidden_size].
The hidden states of the last num_detection_tokens tokens are kept, giving shape [batchsize, num_detection_tokens, hidden_size].
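The slicing step above can be illustrated without any tensor library. This is a minimal sketch in which nested Python lists stand in for the [batchsize, seq_len, hidden_size] tensor; the function name `take_detection_tokens` and the toy sizes are hypothetical. With a PyTorch tensor the same selection would be written as `hidden_states[:, -num_detection_tokens:, :]`.

```python
def take_detection_tokens(hidden_states, num_detection_tokens):
    """Keep the last `num_detection_tokens` token vectors of every sequence.

    hidden_states: nested list shaped [batch_size][seq_len][hidden_size].
    """
    return [sequence[-num_detection_tokens:] for sequence in hidden_states]


# Toy input (assumed sizes): batch_size=2, seq_len=6, hidden_size=3.
batch_size, seq_len, hidden_size = 2, 6, 3
hidden_states = [
    [[float(b * 100 + t)] * hidden_size for t in range(seq_len)]
    for b in range(batch_size)
]

detection_states = take_detection_tokens(hidden_states, num_detection_tokens=4)
shape = (
    len(detection_states),
    len(detection_states[0]),
    len(detection_states[0][0]),
)
print(shape)  # (2, 4, 3)
```

Only the trailing tokens are kept because, in this description, they are the ones reserved for detection; the earlier tokens in the sequence are dropped before the boxes are predicted.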
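Prediction heads turn each of these hidden states into a class prediction and a box. Since the model outputs num_detection_tokens boxes while the target has only 5, a one-to-one matching is needed.
Matching process:
1. First compute 3 cost matrices, each of shape [num_detection_tokens, num_target_box]. Each element is a loss value, so each matrix holds the loss for every (pred, target) pair.
2. The 3 cost matrices are the label loss (cross-entropy; the official matcher below approximates it with the negative predicted probability of the target class), the coordinate loss (L1 loss over the 4 values that describe a box), and the GIoU loss (GIoU computed between pairs of boxes).
3. The three cost matrices are combined with weights into the overall cost matrix, of shape [num_detection_tokens, num_target_box].
4. Apply linear_sum_assignment to this matrix to get the matching with the minimum cost (that is, pick 5 elements of the cost matrix, no two in the same row or column, whose sum is minimal). The matching is expressed as index pairs, min(num_detection_tokens, num_target_box) of them. In this example there are 5.
Based on this matching, the loss is computed between each matched pred and target (5 loss terms in this example, summed). The final loss is the weighted sum of the 3 losses described above.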
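The cost-matrix construction in steps 1 to 3 can be sketched in plain Python. This is an illustrative sketch, not the library's implementation: the helper names (`to_corners`, `area`, `giou`, `cost_matrix`), the weights, and the toy values are all assumptions. It assumes boxes are given as (center_x, center_y, width, height) with positive area, and it uses the negative class probability as the classification cost, as the official matcher quoted in this post does.

```python
def to_corners(box):
    """Convert (cx, cy, w, h) to (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def area(box):
    """Area of a corner-format box."""
    return (box[2] - box[0]) * (box[3] - box[1])


def giou(box_a, box_b):
    """Generalized IoU of two corner-format boxes with positive area."""
    ix0, iy0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix1, iy1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    union = area(box_a) + area(box_b) - inter
    # Smallest box that encloses both inputs.
    enclosing = (
        min(box_a[0], box_b[0]), min(box_a[1], box_b[1]),
        max(box_a[2], box_b[2]), max(box_a[3], box_b[3]),
    )
    enclosing_area = area(enclosing)
    return inter / union - (enclosing_area - union) / enclosing_area


def cost_matrix(probs, pred_boxes, target_labels, target_boxes,
                class_w=1.0, bbox_w=1.0, giou_w=1.0):
    """Return a [num_predictions][num_targets] matrix of matching costs."""
    matrix = []
    for prob, pred_box in zip(probs, pred_boxes):
        row = []
        for label, target_box in zip(target_labels, target_boxes):
            class_cost = -prob[label]  # higher probability, lower cost
            bbox_cost = sum(abs(p - t) for p, t in zip(pred_box, target_box))
            giou_cost = -giou(to_corners(pred_box), to_corners(target_box))
            row.append(class_w * class_cost + bbox_w * bbox_cost + giou_w * giou_cost)
        matrix.append(row)
    return matrix


# Toy data: 3 predictions, 2 targets, class index 2 plays the "no object" role.
probs = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
pred_boxes = [(0.5, 0.5, 0.2, 0.2), (0.3, 0.3, 0.1, 0.1), (0.8, 0.8, 0.2, 0.2)]
target_labels = [0, 1]
target_boxes = [(0.5, 0.5, 0.2, 0.2), (0.3, 0.3, 0.1, 0.1)]

costs = cost_matrix(probs, pred_boxes, target_labels, target_boxes)
for row in costs:
    print([round(value, 3) for value in row])
```

In this toy case prediction 0 coincides with target 0 and prediction 1 with target 1, so those two entries are the smallest in their rows, which is what the assignment step then exploits.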
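There are in fact two more losses:
1. "cardinality" loss: the L1 loss between the number of predicted class labels (out of the num_detection_tokens outputs) that are not "no object" and num_target_box. Put simply, apart from the 5 boxes that have a real class, every other box should be classified as "no object" so that the model does not detect too many objects. This loss produces no gradient, however, and is used only for evaluation.
2. mask loss: its purpose is not yet clear to me.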
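The cardinality term described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the function name `cardinality_error`, the label encoding, and the toy labels are hypothetical, and the predicted label of each detection token is assumed to have been obtained already (for example by taking the most likely class).

```python
def cardinality_error(pred_labels_per_image, num_targets_per_image, no_object_label):
    """Mean absolute difference between predicted object count and target count."""
    errors = []
    for pred_labels, num_targets in zip(pred_labels_per_image, num_targets_per_image):
        # Count detection tokens whose predicted class is a real object.
        num_predicted = sum(1 for label in pred_labels if label != no_object_label)
        errors.append(abs(num_predicted - num_targets))
    return sum(errors) / len(errors)


# Toy batch of 2 images, 8 detection tokens each; label 3 means "no object".
pred_labels = [
    [0, 3, 3, 1, 2, 3, 1, 3],  # 4 predicted objects, 5 targets
    [3, 3, 3, 3, 0, 3, 3, 3],  # 1 predicted object, 1 target
]
error = cardinality_error(pred_labels, num_targets_per_image=[5, 1], no_object_label=3)
print(error)  # 0.5
```

Because the count comes from a hard class decision, the value carries no gradient, which matches the note above that it is only used for evaluation.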

The official matching function, which uses the Hungarian algorithm:

# Copied from transformers.models.detr.modeling_detr.DetrHungarianMatcher with Detr->Yolos
class YolosHungarianMatcher(nn.Module):
    """
    This class computes an assignment between the targets and the predictions of the network.

    For efficiency reasons, the targets don't include the no_object. Because of this, in general, there are more
    predictions than targets. In this case, we do a 1-to-1 matching of the best predictions, while the others are
    un-matched (and thus treated as non-objects).

    Args:
        class_cost:
            The relative weight of the classification error in the matching cost.
        bbox_cost:
            The relative weight of the L1 error of the bounding box coordinates in the matching cost.
        giou_cost:
            The relative weight of the giou loss of the bounding box in the matching cost.
    """

    def __init__(self, class_cost: float = 1, bbox_cost: float = 1, giou_cost: float = 1):
        super().__init__()
        requires_backends(self, ["scipy"])

        self.class_cost = class_cost
        self.bbox_cost = bbox_cost
        self.giou_cost = giou_cost
        if class_cost == 0 and bbox_cost == 0 and giou_cost == 0:
            raise ValueError("All costs of the Matcher can't be 0")

    @torch.no_grad()
    def forward(self, outputs, targets):
        """
        Args:
            outputs (`dict`):
                A dictionary that contains at least these entries:
                * "logits": Tensor of dim [batch_size, num_queries, num_classes] with the classification logits
                * "pred_boxes": Tensor of dim [batch_size, num_queries, 4] with the predicted box coordinates.
            targets (`List[dict]`):
                A list of targets (len(targets) = batch_size), where each target is a dict containing:
                * "class_labels": Tensor of dim [num_target_boxes] (where num_target_boxes is the number of
                  ground-truth objects in the target) containing the class labels
                * "boxes": Tensor of dim [num_target_boxes, 4] containing the target box coordinates.

        Returns:
            `List[Tuple]`: A list of size `batch_size`, containing tuples of (index_i, index_j) where:
            - index_i is the indices of the selected predictions (in order)
            - index_j is the indices of the corresponding selected targets (in order)
            For each batch element, it holds: len(index_i) = len(index_j) = min(num_queries, num_target_boxes)
        """
        batch_size, num_queries = outputs["logits"].shape[:2]

        # We flatten to compute the cost matrices in a batch
        out_prob = outputs["logits"].flatten(0, 1).softmax(-1)  # [batch_size * num_queries, num_classes]
        out_bbox = outputs["pred_boxes"].flatten(0, 1)  # [batch_size * num_queries, 4]

        # Also concat the target labels and boxes
        target_ids = torch.cat([v["class_labels"] for v in targets])
        target_bbox = torch.cat([v["boxes"] for v in targets])

        # Compute the classification cost. Contrary to the loss, we don't use the NLL,
        # but approximate it in 1 - proba[target class].
        # The 1 is a constant that doesn't change the matching, it can be ommitted.
        class_cost = -out_prob[:, target_ids]

        # Compute the L1 cost between boxes
        bbox_cost = torch.cdist(out_bbox, target_bbox, p=1)

        # Compute the giou cost between boxes
        giou_cost = -generalized_box_iou(center_to_corners_format(out_bbox), center_to_corners_format(target_bbox))

        # Final cost matrix
        cost_matrix = self.bbox_cost * bbox_cost + self.class_cost * class_cost + self.giou_cost * giou_cost
        cost_matrix = cost_matrix.view(batch_size, num_queries, -1).cpu()

        sizes = [len(v["boxes"]) for v in targets]
        indices = [linear_sum_assignment(c[i]) for i, c in enumerate(cost_matrix.split(sizes, -1))]
        return [(torch.as_tensor(i, dtype=torch.int64), torch.as_tensor(j, dtype=torch.int64)) for i, j in indices]
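To make the meaning of the returned index pairs concrete, the assignment step can be reproduced on a tiny cost matrix. The sketch below is not the Hungarian algorithm: it is a brute-force search over all one-to-one assignments, which finds the same minimum-cost matching but is only practical for very small inputs. The function name `brute_force_assignment` and the cost values are hypothetical, and the sketch assumes there are at least as many predictions (rows) as targets (columns).

```python
from itertools import permutations


def brute_force_assignment(cost):
    """Return (row_indices, col_indices) of the minimum-cost one-to-one matching.

    cost: [num_predictions][num_targets], with num_predictions >= num_targets.
    """
    num_rows, num_cols = len(cost), len(cost[0])
    best_rows, best_total = None, float("inf")
    # Try every ordered choice of distinct rows, one row per target column.
    for rows in permutations(range(num_rows), num_cols):
        total = sum(cost[r][c] for c, r in enumerate(rows))
        if total < best_total:
            best_rows, best_total = rows, total
    # Report the pairs sorted by row index, the ordering SciPy also uses.
    pairs = sorted(zip(best_rows, range(num_cols)))
    return [r for r, _ in pairs], [c for _, c in pairs]


# 4 predictions (rows) and 3 targets (columns).
cost = [
    [0.9, 0.2, 0.8],
    [0.1, 0.7, 0.9],
    [0.8, 0.9, 0.3],
    [0.5, 0.6, 0.4],
]
rows, cols = brute_force_assignment(cost)
print(rows, cols)  # [0, 1, 2] [1, 0, 2]
```

Here prediction 0 is matched to target 1, prediction 1 to target 0, and prediction 2 to target 2, for a total cost of 0.6. Prediction 3 stays unmatched and would be treated as "no object".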

Object detection involves many more details, which I will cover in a later update.
