There are 322 papers published in this subject since this site started.
1. Hierarchical Federated Learning with Gaussian Differential Privacy
ZHOU Tao, PENG Hai-Peng
Computer Science and Technology, 28 February 2022
Abstract: Federated learning is a privacy-preserving machine learning technology. Each participant can build the model without disclosing its underlying data, sharing only the model's weight updates and gradient information with the server. However, much work has shown that attackers can recover a client's contributions and the associated private training data from the publicly shared gradients, so gradient exchange is no longer safe. To secure federated learning, differential privacy methods add noise to model updates to obscure each client's contribution, thereby resisting membership inference attacks, preventing malicious clients from learning information about other clients, and ensuring private outputs. This paper proposes a new differentially private aggregation scheme that adopts a more fine-grained hierarchical update strategy. For the first time, the $f$-differential privacy ($f$-DP) method is used for the privacy analysis of federated aggregation, and Gaussian noise is added to model updates to protect client-level privacy. We show experimentally that the $f$-DP method improves on previous privacy analyses: it accurately captures the privacy loss at every communication round of federated training and overcomes the problem, common in previous work, of ensuring privacy at the cost of reduced model utility. It also provides a federated model-update scheme with wider applicability and better utility. When enough users participate in federated learning, client-level privacy is guaranteed while model loss is minimized.
To cite this article: ZHOU Tao, PENG Hai-Peng. Hierarchical Federated Learning with Gaussian Differential Privacy [OL]. [28 February 2022] http://en.paper.edu.cn/en_releasepaper/content/4756405
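The core mechanism the abstract describes (clip each client's update, average, and add Gaussian noise calibrated to the clipping norm) can be sketched as below. This is a generic DP-FedAvg-style illustration under assumed names and noise calibration, not the paper's actual hierarchical $f$-DP scheme.

```python
import numpy as np

def dp_federated_aggregate(updates, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Aggregate client updates with L2 clipping and Gaussian noise.

    updates: list of 1-D numpy arrays, one flattened model update per client.
    Each update is scaled so its L2 norm is at most clip_norm, the clipped
    updates are averaged, and Gaussian noise with standard deviation
    noise_multiplier * clip_norm / n is added to the mean.
    """
    rng = rng or np.random.default_rng(0)
    n = len(updates)
    clipped = []
    for u in updates:
        norm = np.linalg.norm(u)
        scale = min(1.0, clip_norm / max(norm, 1e-12))  # clip, never amplify
        clipped.append(u * scale)
    mean = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / n
    return mean + rng.normal(0.0, sigma, size=mean.shape)
```

With `noise_multiplier=0` this reduces to plain clipped averaging, which makes the clipping step easy to verify in isolation.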
2. OSnet: One-Shot Network for Video Inpainting
Tan Rujian, Wang Jing
Computer Science and Technology, 28 February 2022
Abstract: Video inpainting is a challenging task in the field of computer vision. Current methods are usually flow-based, in order to improve temporal coherence and obtain better inpainting results. However, these networks are usually complex and have limited practical value. In this paper, we propose a fast video inpainting network based on an encoder-decoder architecture, which reduces network running time by enhancing the network's feature-extraction ability. We validate our approach on a video inpainting dataset built from the DAVIS and Septuplet datasets. Experimental results show that our method compares favorably against mainstream algorithms and has good inpainting capability.
To cite this article: Tan Rujian, Wang Jing. OSnet: One-Shot Network for Video Inpainting [OL]. [28 February 2022] http://en.paper.edu.cn/en_releasepaper/content/4756327
3. No-reference video quality assessment based on human attention system for background replacement applications
WANG Yinan, WANG Jing, SHEN Qiwei
Computer Science and Technology, 25 February 2022
Abstract: With the wide application of image and video background replacement in scenarios such as short-video production and high-definition video conferencing, more and more background-replacement algorithms and video creations are produced, but the quality of the resulting images and videos varies greatly. Evaluating quality after background replacement therefore has important guiding significance in both industry and academia. In the background-replacement setting, the factors affecting video quality include the distortion of video frames, inter-frame jitter, composition, and chroma harmony. Among these, the accuracy and quality of individual video frames is a particularly important evaluation dimension. In this paper, we propose a deep learning algorithm based on a visual attention mechanism to assess accuracy quality in video background-replacement applications. First, a convolutional neural network (CNN) is designed to extract distortion features; then spatial saliency features and temporal motion features are fused through the attention mechanism. Finally, human visual perception of video accuracy is fitted to evaluate the perceived accuracy of background-replacement videos.
To cite this article: WANG Yinan, WANG Jing, SHEN Qiwei. No-reference video quality assessment based on human attention system for background replacement applications [OL]. [25 February 2022] http://en.paper.edu.cn/en_releasepaper/content/4756403
4. Adaptive and Attention-joint Supervision for Weakly Supervised Segmentation
Ma Yue, Wan Hongjiang
Computer Science and Technology, 23 February 2022
Abstract: Image-level weakly supervised semantic segmentation faces the great challenge of compensating for missing mask labels. Methods based on image-level labels primarily use the class activation map (CAM) to approximate the segmentation mask. Because pseudo masks focus only on the class-specific discriminative regions of objects, various methods have been explored to expand pseudo masks toward the ground truth. Contemporary methods tend to use a lower threshold to distinguish objects from backgrounds in order to adjust CAMs for higher object coverage. However, too much background is then misclassified into the pseudo masks, and excessive noise is propagated into downstream training. To address this, we propose our Adaptive and Attention-joint Supervision method (AAJS). AAJS divides the classification network into two branches and imposes a trend of expansion and convergence, respectively, on their class-specific features, i.e., the regions activated by the CAM. Each branch is adaptively constrained by the other based on the confidence of its features. The class-specific features are thereby enriched, yielding a more accurate CAM at a lower threshold. Moreover, we propose an adaptive feature-dropout method to prevent the classification network from relying too heavily on discriminative regions. Evaluated on PASCAL VOC 2012, AAJS matches or exceeds the state-of-the-art performance of existing methods.
To cite this article: Ma Yue, Wan Hongjiang. Adaptive and Attention-joint Supervision for Weakly Supervised Segmentation [OL]. [23 February 2022] http://en.paper.edu.cn/en_releasepaper/content/4756323
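The threshold step the abstract revolves around (binarizing a normalized CAM into a pseudo mask, where a lower threshold expands object coverage but admits more background) can be illustrated with a minimal sketch; the function name and min-max normalization are assumptions for illustration, not the AAJS method itself.

```python
import numpy as np

def cam_to_pseudo_mask(cam, threshold=0.3):
    """Normalize a class activation map to [0, 1] and threshold it into a
    binary pseudo mask (1 = object, 0 = background)."""
    cam = cam - cam.min()
    denom = cam.max()
    if denom > 0:
        cam = cam / denom
    return (cam >= threshold).astype(np.uint8)
```

Lowering `threshold` grows the mask, which is exactly the coverage-versus-noise trade-off the paper targets.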
5. Research On Enhancing Video Temporal Consistency
Meng Junfeng, Wang Jing
Computer Science and Technology, 16 February 2022
Abstract: In the secondary production of short videos, pixel flickering occurs when an image algorithm is applied directly to a video frame by frame; this is known as poor temporal consistency. To solve this problem, this paper treats video temporal consistency as a learning task and proposes a network architecture based on optical-flow constraints. It is a post-processing technique that takes the per-frame processed video and the original video as input and outputs a temporally stable video. Since the post-processing algorithm is independent of the specific image algorithm, it generalizes to many flickering scenarios, such as video stylization, dehazing, super-resolution, and colorization. The network consists of a temporal stabilization module called TCNet, a loss-calculation module, and an optical-flow constraint module. The temporal stabilization module introduces ConvGRU, which effectively improves the extraction of temporal information and enhances the memory capacity of the network. For training, this paper also proposes a new hybrid loss, a weighted sum of a temporal loss and a spatial loss. The algorithm computes optical flow only during training and needs no optical flow at inference, which effectively meets the real-time requirements of deployment. Extensive experiments verify that TCNet greatly enhances temporal consistency while preserving perceptual similarity, achieving the best performance among existing algorithms.
To cite this article: Meng Junfeng, Wang Jing. Research On Enhancing Video Temporal Consistency [OL]. [16 February 2022] http://en.paper.edu.cn/en_releasepaper/content/4756324
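The hybrid loss described above (a weighted sum of a spatial term against the per-frame processed result and a temporal term against the previous output frame warped by optical flow) can be sketched as follows. The flow-warped previous frame is assumed to be precomputed, and the function name, MSE form, and weight are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def hybrid_loss(output, processed, warped_prev_output, alpha=50.0):
    """Weighted sum of a spatial loss (stay close to the per-frame processed
    result) and a temporal loss (stay close to the previous output frame,
    already warped to the current frame by optical flow).

    output, processed, warped_prev_output: float arrays of the same shape.
    alpha trades temporal consistency against spatial fidelity.
    """
    spatial = np.mean((output - processed) ** 2)
    temporal = np.mean((output - warped_prev_output) ** 2)
    return spatial + alpha * temporal
```

Because the warped frame (and hence optical flow) appears only inside the loss, the trained network needs no flow at inference time, matching the abstract's real-time claim.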
6. Multi-scale convolutional recurrent neural network for monaural speech enhancement
Shibo Wei, Ting Jiang
Computer Science and Technology, 14 February 2022
Abstract: The convolutional recurrent network (CRN) has achieved much success in the field of speech enhancement. The common approach is to first downsample and compress the speech with convolution layers, then use an LSTM for sequence modeling of the compressed features to integrate global information of the speech sequence, and finally recover the resolution with deconvolution layers. However, downsampling may lose speech feature information, so this paper proposes a multi-scale convolutional recurrent network (MS-CRN). In this model, an LSTM with residual connections performs sequence modeling on the downsampled output of each layer, and the result is concatenated with the output of the corresponding deconvolution layer as the input of the next deconvolution layer. This structure combines sequence features at different scales to further exploit the global information of the speech. Experimental results show that, under the same conditions, MS-CRN obtains higher scores than the original CRN on metrics such as the scale-invariant signal-to-distortion ratio (SI-SDR).
To cite this article: Shibo Wei, Ting Jiang. Multi-scale convolutional recurrent neural network for monaural speech enhancement [OL]. [14 February 2022] http://en.paper.edu.cn/en_releasepaper/content/4756218
7. Double attention module for rapid multi-scale object detection
LIANG Jiaqi, MA Yue
Computer Science and Technology, 27 January 2022
Abstract: For object detection, many methods based on deep convolutional neural networks have greatly improved detection speed while ensuring accuracy. However, numerous approaches still focus inaccurately on multi-scale objects, and they cannot capture sufficient features because they lack global context information. In this paper, we start from these issues and propose an effective architecture called DoubleS-AM, comprising a Spatial Pyramid Pooling Attention Module (SPP-AM) and a Self-Weight Attention Module (SW-AM), which aims to capture important information in the feature maps at two different levels with an attention mechanism: a channel-level and a spatial-level module. Specifically, the channel-level module (SPP-AM) attends adaptively to multi-scale objects by weighting channels carrying different receptive-field feature information, while the spatial-level module (SW-AM) captures the global context similarity distribution of deeper features to enhance the semantic information of shallower features via the feature pyramid. Combining the two modules, we design end-to-end trainable networks that emphasize useful information while generating reliable and rapid predictions. We conduct extensive experiments against state-of-the-art baselines and significantly improve object-detection results. The mAP (IoU=0.5) of the networks we design on PASCAL VOC2007 increases by 4%. On MS COCO2017, mAP increases by about 3% and the small-object AP of our networks increases by 3%-5%, indicating a significant effect on multi-scale detection, especially small-object detection.
To cite this article: LIANG Jiaqi, MA Yue. Double attention module for rapid multi-scale object detection [OL]. [27 January 2022] http://en.paper.edu.cn/en_releasepaper/content/4756196
8. Research on Multi-Object Tracking Algorithm Based on Deep Feature
SUN Qing-Hong, DONG Yuan
Computer Science and Technology, 23 January 2022
Abstract: This paper proposes a practical multiple object tracking system based on DeepSort (Simple Online and Realtime Tracking with a Deep Association Metric). The system uses the tracking-by-detection framework, which divides optimization into two parts: detection and tracking. Detection quality is a key element of the algorithm's overall performance; comparing speed and accuracy, suitable detectors for multi-object tracking scenes are SDP-CRC and YOLOv3. Additionally, this project proposes a new pre-processing method based on NMS (non-maximum suppression), which improves tracking performance by up to 3.5% in terms of MOTA (Multiple Object Tracking Accuracy). For the tracking component, this project replaces the traditional Kalman motion-estimation method with a new estimation method based on a visual object tracker (SiamRPN). Experimental results show that this method improves MOTA by 0.3%, although the running speed decreases by nearly a factor of five, indicating that the experiment should be carried out more finely. In future work, the method can be improved by updating the template frame appropriately, using the appearance information provided by the SiamRPN tracker in the matching process, and improving the robustness of the tracker. Additionally, this project optimizes the detection-to-tracker association algorithm by using positional and velocity relationships, reducing algorithmic complexity by avoiding multi-dimensional matrix operations.
To cite this article: SUN Qing-Hong, DONG Yuan. Research on Multi-Object Tracking Algorithm Based on Deep Feature [OL]. [23 January 2022] http://en.paper.edu.cn/en_releasepaper/content/4756184
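The abstract builds its pre-processing on NMS; the standard greedy algorithm it starts from looks like this (the paper's modified variant is not described in the abstract, so only plain NMS is shown).

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression.

    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) confidences.
    Repeatedly keeps the highest-scoring box and discards remaining boxes
    whose IoU with it exceeds iou_thresh. Returns kept indices.
    """
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # Intersection rectangle between box i and every remaining box.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]
    return keep
```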
9. Ship Detection with Optical Image Based on Attention and Loss Improved YOLO v3
Yao Chen, Yuanyuan Qiao
Computer Science and Technology, 20 January 2022
Abstract: Object detection is a critical research topic in computer vision, and its results have been widely applied in recent years. As a subtask of object detection, ship detection has important research significance. Most existing ship-detection research is based on SAR (Synthetic Aperture Radar) images. However, SAR imaging differs from optical imaging, and results obtained on SAR images cannot be transferred directly to optical images. Compared with SAR images, optical images contain richer image features, which help the algorithm learn ship characteristics. In addition, optical-image ship detection has greater commercial value: companies need only a simple optical camera to perform ship detection, rather than an expensive device such as radar, making the approach more reusable. This paper conducts optical-image ship-detection experiments, applies the YOLO v3 algorithm to the ship-detection task, introduces an attention mechanism into the residual blocks of the DarkNet-53 network, and achieves better performance. It also optimizes the recently proposed CIoU loss function, which outperforms ln-norm losses, and on this basis presents the AIoU loss function. By adding an area penalty term, the AIoU loss improves both performance and convergence speed compared with the CIoU loss. Applied to ship detection in optical images and compared with the GIoU, DIoU, and CIoU loss functions, it achieves better results in the bounding-box regression of ship detection.
To cite this article: Yao Chen, Yuanyuan Qiao. Ship Detection with Optical Image Based on Attention and Loss Improved YOLO v3 [OL]. [20 January 2022] http://en.paper.edu.cn/en_releasepaper/content/4756133
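The IoU quantity that the whole GIoU/DIoU/CIoU/AIoU family extends with different penalty terms can be computed as below; the abstract does not specify the AIoU area penalty, so only the base IoU is shown.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes [x1, y1, x2, y2].

    IoU = area(A ∩ B) / area(A ∪ B); losses such as GIoU/DIoU/CIoU
    subtract penalty terms from this value before forming 1 - IoU.
    """
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)  # overlap area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```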
10. Research on Single-channel Target Speech Extraction Using μ-Law Algorithm
Linhui Qiu, Jing Wang
Computer Science and Technology, 17 January 2022
Abstract: When faced with mixed audio, people are often interested in the voice of one person rather than all the voices. Speaker extraction addresses precisely this situation: extracting the voice of a target speaker in a multi-speaker environment, in order to imitate human selective auditory attention. In previous models, extraction is generally performed in the frequency domain, and the time-domain signal is reconstructed from the extracted magnitude and an estimated phase spectrum; this reconstruction, however, is affected by phase-estimation errors. With reference to Conv-TasNet and SpEx, this paper uses a time-domain speaker extraction network (F3S). The network converts the mixed speech into embedding coefficients instead of decomposing the signal into magnitude and phase spectra, thereby avoiding phase estimation. On this basis, this paper applies the μ-law compression and expansion algorithm to the data, aiming to improve training and extraction speed. Experimental results show that the F3S network can effectively improve training and extraction speed while maintaining a good scale-invariant SDR (SI-SDR).
To cite this article: Linhui Qiu, Jing Wang. Research on Single-channel Target Speech Extraction Using μ-Law Algorithm [OL]. [17 January 2022] http://en.paper.edu.cn/en_releasepaper/content/4756081
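The μ-law companding the abstract applies is a standard transform, F(x) = sign(x)·ln(1 + μ|x|)/ln(1 + μ) for x in [-1, 1], together with its exact inverse; how it is wired into the F3S pipeline is the paper's contribution and is not reproduced here.

```python
import numpy as np

def mu_law_compress(x, mu=255):
    """μ-law companding: F(x) = sign(x) * ln(1 + μ|x|) / ln(1 + μ).

    Expands small amplitudes and compresses large ones, so quantization
    levels are spent where speech energy typically lies. x in [-1, 1].
    """
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_expand(y, mu=255):
    """Inverse μ-law: recovers x from y = F(x)."""
    return np.sign(y) * ((1 + mu) ** np.abs(y) - 1) / mu
```

The pair is an exact round trip: expanding a compressed signal recovers the original up to floating-point error.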
© 2003-2012 Sciencepaper Online, unless otherwise stated.