There are 40 papers published in this subject since this site started.
1. Convolutional Neural Network Based on Optical Flow for Deepfake Detection
Yang Piaoyang, Gao Yuanyuan
Computer Science and Technology, 16 May 2022
Abstract: With the development and popularization of communications technology, images and video play an increasingly important role in the media, and the harm caused by image and video forgery is growing. Deeply forged (deepfake) face videos are especially damaging: because their forgery effects are so fine, they are highly deceptive and do great harm to the social credit system. Detection of deeply forged video has therefore attracted the attention of scholars worldwide. Existing methods can be divided into traditional methods and deep-learning-based methods. Traditional methods identify finely forged videos poorly or require manual participation, while deep-learning-based methods suffer from poor interpretability and insufficient generalization because they rely too heavily on their training datasets. This paper proposes an identification model based on optical flow and tests it on public datasets, achieving excellent results. The proposed model has a simple structure, and the clues it uses for identification are easier to understand. Experiments show that the model has better interpretability and generalization.
To cite this article: Yang Piaoyang, Gao Yuanyuan. Convolutional Neural Network Based on Optical Flow for Deepfake Detection [OL]. [16 May 2022] http://en.paper.edu.cn/en_releasepaper/content/4757774
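Since the paper's model takes optical-flow fields rather than raw frames as CNN input, a toy illustration of what such a field is may help. The following pure-Python brute-force block matcher is only a stand-in: the function name, single-pixel matching cost, and search radius are my own simplifications, and practical systems use estimators such as Farneback's algorithm or learned flow networks.

```python
def block_flow(prev, curr, radius=1):
    """Toy optical flow between two small grayscale frames (lists of lists):
    for each pixel of `prev`, find the displacement (dy, dx) within `radius`
    whose target pixel in `curr` best matches its intensity."""
    h, w = len(prev), len(prev[0])
    flow = [[(0, 0)] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best_cost, best_d = float("inf"), (0, 0)
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        cost = abs(curr[ny][nx] - prev[y][x])
                        if cost < best_cost:
                            best_cost, best_d = cost, (dy, dx)
            flow[y][x] = best_d
    return flow
```

A detector in the spirit of the abstract would then feed such flow fields, computed between consecutive video frames, into a CNN classifier.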
2. Using Graph Sampling and Aggregation to Refine Speaker Embeddings in Speaker Diarization
HE Shuyi, WANG Lei
Computer Science and Technology, 18 March 2022
Abstract: At present, deep neural networks are often used to extract speaker embeddings such as x-vectors and d-vectors, which are combined with clustering to implement a speaker diarization system. The robustness of the speaker embedding determines the performance of the system. Recently, ECAPA-TDNN embeddings have shown better performance than x-vectors in speaker classification. In this work, the embeddings extracted from each session are converted into a graph: each embedding is a node, and two nodes whose similarity exceeds a set threshold are connected. Features are sampled and aggregated from each node's local neighborhood, and the structural information in the graph is used to reconstruct new speaker embeddings for each session through supervised learning. The refined embeddings are then used for speaker diarization with spectral clustering. The proposed system achieves state-of-the-art results on the AMI dataset.
To cite this article: HE Shuyi, WANG Lei. Using Graph Sampling and Aggregation to Refine Speaker Embeddings in Speaker Diarization [OL]. [18 March 2022] http://en.paper.edu.cn/en_releasepaper/content/4756829
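The graph refinement step the abstract describes (connect embeddings whose similarity exceeds a threshold, then sample and aggregate neighborhood features) can be sketched as a single mean-aggregation round over a cosine-similarity graph. This is a simplification in the spirit of GraphSAGE; the function names and the fixed threshold are my own, and the paper's actual model learns the aggregation through supervised training.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def refine_embeddings(embeddings, threshold=0.5):
    """One round of mean aggregation over a similarity-threshold graph:
    each embedding is replaced by the mean of itself and its neighbors."""
    n = len(embeddings)
    refined = []
    for i in range(n):
        neigh = [embeddings[j] for j in range(n)
                 if j != i and cosine(embeddings[i], embeddings[j]) > threshold]
        nodes = [embeddings[i]] + neigh
        refined.append([sum(x) / len(nodes) for x in zip(*nodes)])
    return refined
```

Spectral clustering would then run on the refined embeddings instead of the raw ones.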
3. Privacy-Preserving Way Based on Hypothesis Testing in Federated Learning
LI Hui-Zhen, LI Li-Xiang
Computer Science and Technology, 16 March 2022
Abstract: Federated learning has set off a wave in deep learning applications since it was proposed in 2016. It is well suited to modeling on separate, independent datasets that contain sensitive information, thus breaking down the data islands faced by most technology companies. However, the combination of differential privacy and federated learning remains unsatisfactory in current studies. In this paper, we propose a new federated learning algorithm in which a relaxation of differential privacy termed f-differential privacy (f-DP) is used for detailed and rigorous privacy analysis to enhance privacy protection. f-DP retains the hypothesis-testing interpretation of differential privacy and defines a trade-off function connecting the type I and type II errors, thereby tracking privacy loss and measuring privacy leakage more accurately. We prove that this approach achieves tighter privacy constraints and a more effective privacy guarantee than the centralized differential privacy used in previous federated learning frameworks, while maintaining high accuracy in the shared model, yielding a better balance between data availability and privacy security.
To cite this article: LI Hui-Zhen, LI Li-Xiang. Privacy-Preserving Way Based on Hypothesis Testing in Federated Learning [OL]. [16 March 2022] http://en.paper.edu.cn/en_releasepaper/content/4756454
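The trade-off function the abstract refers to is standard in the f-DP literature: for μ-Gaussian differential privacy it is G_μ(α) = Φ(Φ⁻¹(1 − α) − μ), the minimal type II error achievable at type I error α when testing which of two neighboring datasets was used. A minimal sketch; the bisection-based inverse CDF is my own simplification.

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi_inv(p, lo=-10.0, hi=10.0, tol=1e-10):
    """Inverse standard normal CDF by bisection (adequate for a sketch)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def gaussian_tradeoff(alpha, mu):
    """G_mu(alpha): minimal type II error at type I error alpha under mu-GDP."""
    return phi(phi_inv(1.0 - alpha) - mu)
```

Larger μ means the two hypotheses are easier to distinguish, i.e. more privacy leakage, so the curve drops toward zero.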
4. Session-based Recommendation with Self-Distillation Graph Neural Networks
Yuming Wang, Siyang Zhang
Computer Science and Technology, 15 March 2022
Abstract: Recommendation systems have become fundamental in e-commerce, and session-based recommendation plays an increasingly significant role in them because of its flexibility and high practical value. Although previous works show promising results, they are still insufficient for superior recommendation performance because the information about the next click in each session is limited and often noisy. To obtain more accurate predictive vectors without being misled by potentially noisy information, we propose Self-Distillation Graph Neural Networks (SD-GNN), which make full use of the valuable information in a session. Specifically, we employ a well-evaluated and flexible deep ensemble as the teacher model, assembling multiple randomly initialized GNNs in a simple way. We then use the soft target distribution produced by the teacher to train each GNN in the ensemble, achieving self-knowledge distillation. The whole method is easy to implement and scalable thanks to the proposed self-distillation technique. Extensive experiments on two benchmark datasets verify that SD-GNN significantly outperforms state-of-the-art baselines and shows powerful performance in session-based recommendation.
To cite this article: Yuming Wang, Siyang Zhang. Session-based Recommendation with Self-Distillation Graph Neural Networks [OL]. [15 March 2022] http://en.paper.edu.cn/en_releasepaper/content/4756481
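The distillation step (training each GNN on the teacher ensemble's soft target distribution) boils down to a KL-divergence loss between temperature-softened distributions. A minimal pure-Python sketch: the temperature value is an assumption, and in the paper the teacher logits would come from the ensemble of GNNs rather than be given directly.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    exps = [math.exp(z / temperature) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions:
    zero when the student matches the teacher, positive otherwise."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```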
5. Sound Event Detection of Weakly Labelled Data with Auxiliary Clustering Loss
WANG Yi, WANG Lei
Computer Science and Technology, 11 March 2022
Abstract: Sound event detection (SED) is the task of detecting the onsets and offsets of sound events in an audio recording. SED on weakly labelled data needs only the event category labels present in a recording, not their onset and offset times, which significantly reduces the cost of labeling data. A weakly labelled SED system cannot know exactly which events are active in each frame during training, so it must capture more temporal information so that predictions for frames containing the same event are as similar as possible. Drawing inspiration from unsupervised deep clustering, which lets a network learn more similar feature representations for elements of the same class, this study proposes jointly training neural networks with an auxiliary clustering loss. The proposed joint training method improves the performance of multiple systems on the DCASE 2017 Task 4 dataset and achieves a state-of-the-art F1 score for sound event detection.
To cite this article: WANG Yi, WANG Lei. Sound Event Detection of Weakly Labelled Data with Auxiliary Clustering Loss [OL]. [11 March 2022] http://en.paper.edu.cn/en_releasepaper/content/4756833
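The joint objective the abstract describes can be sketched as a weighted sum of the weak-label classification loss and an auxiliary clustering term pulling frame features toward class centroids. All names and the specific clustering term (squared distance to the nearest centroid) are my own assumptions about the general idea, not the paper's exact formulation.

```python
import math

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy for one clip-level event prediction."""
    return -(target * math.log(pred + eps) + (1 - target) * math.log(1 - pred + eps))

def clustering_loss(frame_features, centroids):
    """Mean squared distance from each frame feature to its nearest centroid."""
    total = 0.0
    for f in frame_features:
        total += min(sum((a - b) ** 2 for a, b in zip(f, c)) for c in centroids)
    return total / len(frame_features)

def joint_loss(preds, targets, frame_features, centroids, weight=0.1):
    """Weak-label classification loss plus the auxiliary clustering term."""
    cls = sum(bce(p, t) for p, t in zip(preds, targets)) / len(preds)
    return cls + weight * clustering_loss(frame_features, centroids)
```

In a real system the centroids would themselves be re-estimated during training rather than fixed.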
6. DSPNet: A Lightweight Dilated Spatial Pyramid Network for Semantic Segmentation of Point Clouds
LIN Song-Nan, YANG Zhen
Computer Science and Technology, 10 March 2022
Abstract: Point clouds are a sparse, irregular, and unstructured form of data. These properties make it impossible to obtain a uniform downsampling of a point cloud efficiently, and make it difficult to capture contextual information. High-cost downsampling methods and complex network-layer designs prevent existing deep networks from being applied directly to large point clouds: to compensate for the information lost in downsampling, existing methods require a large number of parameters and well-designed, complex encoder layers to obtain wider semantic information, which makes the models computationally heavy and limits the scale of point clouds they can process. In this paper, we explore a new structure for point cloud semantic segmentation networks. The proposed scalability module borrows the idea of dilated convolution from 2D image processing to enlarge receptive fields without a downsampling operation, which is computationally expensive for large-scale point clouds. To further optimize performance, we combine the dilated neighborhood with a spatial pyramid structure to replace the complex network layers of existing methods, which significantly reduces network parameters and improves accuracy. The proposed model, DSPNet, with RandLA-Net as its backbone, achieves superior accuracy and efficiency. We evaluated accuracy on the SemanticKITTI and Semantic3D datasets and compared our model with other state-of-the-art methods. The results show that, compared with the baseline model, our method reduces the number of parameters by 59% with similar accuracy, runs faster, and infers more points.
To cite this article: LIN Song-Nan, YANG Zhen. DSPNet: A Lightweight Dilated Spatial Pyramid Network for Semantic Segmentation of Point Clouds [OL]. [10 March 2022] http://en.paper.edu.cn/en_releasepaper/content/4756478
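The dilated-neighborhood idea (enlarging a point's receptive field without downsampling, by analogy with dilated convolution in 2D) can be illustrated by keeping every `dilation`-th of a point's sorted nearest neighbors: the same number of neighbors then spans a wider region. This is only my reading of the general mechanism; the paper's module additionally combines such neighborhoods in a spatial pyramid.

```python
def dilated_neighbors(distances, k=4, dilation=2):
    """Given a point's distances to all other points, return the indices of
    its dilated neighborhood: every `dilation`-th of the nearest k*dilation
    candidates, sorted by distance. dilation=1 is the ordinary k-NN set."""
    order = sorted(range(len(distances)), key=lambda j: distances[j])
    return order[:k * dilation:dilation]
```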
7. Research on Federal Image Classification System Based on Hybrid Differential Privacy
SHI Jiaxin, LI Lixiang
Computer Science and Technology, 08 March 2022
Abstract: In response to the limitation that data cannot be shared directly among institutions in real scenarios, Google proposed federated learning in 2016. Its innovation is to provide a distributed deep learning architecture that protects data privacy while enabling thousands of participants or clients to iteratively train the same deep learning model. However, although federated learning protects data privacy to a certain extent, experiments have shown that federated learning systems are still at risk of attack. This paper therefore studies a federated learning system based on hybrid differential privacy. According to the attack model, clients in federated learning are divided into two categories: those that trust the central server and those that do not. By adding noise to the clients' local gradients and adding noise centrally at the server, the privacy of the system is guaranteed. Finally, a federated image classification system based on hybrid differential privacy is designed.
To cite this article: SHI Jiaxin, LI Lixiang. Research on Federal Image Classification System Based on Hybrid Differential Privacy [OL]. [8 March 2022] http://en.paper.edu.cn/en_releasepaper/content/4756556
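The "add noise to the local gradient" step for clients that do not trust the server corresponds to the usual clip-then-add-Gaussian-noise sanitization of differentially private SGD. A minimal sketch; the function name and parameter values are illustrative, not the paper's.

```python
import math
import random

def dp_sanitize(gradient, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip a gradient vector to L2 norm clip_norm, then add Gaussian noise
    with standard deviation noise_multiplier * clip_norm per coordinate."""
    rng = rng or random.Random(0)
    norm = math.sqrt(sum(g * g for g in gradient))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in gradient]
    sigma = noise_multiplier * clip_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]
```

Server-side (central) noise in the hybrid scheme would be the same operation applied once to the aggregated update instead of per client.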
8. Research On Enhancing Video Temporal Consistency
Meng Junfeng, Wang Jing
Computer Science and Technology, 16 February 2022
Abstract: In the secondary production of short videos, pixel flickering occurs when an image algorithm is applied directly to video frame by frame; this is known as poor temporal consistency. To solve this problem, this paper treats video temporal consistency as a learning task and proposes a network architecture based on optical-flow constraints. It is a post-processing technique: the processed video and the original video are input to the network, which outputs a temporally stable video. Since the post-processing is independent of the specific image algorithm, it generalizes across pixel-flickering scenarios such as video stylization, dehazing, super-resolution, and colorization. The network consists of a temporal stabilization module called TCNet, a loss calculation module, and an optical-flow constraint module. The temporal stabilization module introduces ConvGRU, which effectively improves the extraction of temporal information and enhances the memory capacity of the network. For training, this paper also proposes a new hybrid loss, a weighted sum of a temporal loss and a spatial loss. The algorithm computes optical flow only during training, not during prediction, which effectively satisfies the real-time requirements of practical deployment. Extensive experiments verify that TCNet greatly improves temporal consistency while preserving perceptual similarity, and performs best among existing algorithms.
To cite this article: Meng Junfeng, Wang Jing. Research On Enhancing Video Temporal Consistency [OL]. [16 February 2022] http://en.paper.edu.cn/en_releasepaper/content/4756324
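The temporal part of a hybrid loss like the one described penalizes flicker as the difference between the current output frame and the previous output frame warped by optical flow. A pure-Python sketch with integer per-pixel flow; this is a large simplification (real implementations use bilinear warping and occlusion masks), and the function names are mine.

```python
def warp(frame, flow):
    """Warp a small grayscale frame (list of lists) by integer per-pixel
    flow vectors (dy, dx), clamping source coordinates at the border."""
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            sy = min(max(y - dy, 0), h - 1)
            sx = min(max(x - dx, 0), w - 1)
            out[y][x] = frame[sy][sx]
    return out

def temporal_loss(curr_out, prev_out, flow):
    """Mean absolute difference between the current output frame and the
    flow-warped previous output frame: the flicker penalty."""
    warped = warp(prev_out, flow)
    h, w = len(curr_out), len(curr_out[0])
    return sum(abs(curr_out[y][x] - warped[y][x])
               for y in range(h) for x in range(w)) / (h * w)
```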
9. Research on Single-channel Target Speech Extraction Using μ-Law Algorithm
Linhui Qiu, Jing Wang
Computer Science and Technology, 17 January 2022
Abstract: When faced with mixed audio, people are often interested in the voice of one person rather than all the voices. Speaker extraction addresses precisely this situation, extracting the voice of a target speaker in a multi-speaker environment in order to imitate human selective auditory attention. Previous models generally perform extraction in the frequency domain and reconstruct the time-domain signal from the extracted magnitude and an estimated phase spectrum, but such reconstruction suffers from phase-estimation errors. With reference to Conv-TasNet and SpEx, this paper uses a time-domain speaker extraction network (F3S). The network converts the mixed speech into embedding coefficients instead of decomposing the signal into magnitude and phase spectra, thereby avoiding the phase-estimation problem. On this basis, this paper applies the μ-law compression and expansion algorithm to the data, aiming to improve training and extraction speed. Experimental results show that the F3S network can effectively improve training and extraction speed while maintaining a good scale-invariant SDR (SI-SDR).
To cite this article: Linhui Qiu, Jing Wang. Research on Single-channel Target Speech Extraction Using μ-Law Algorithm [OL]. [17 January 2022] http://en.paper.edu.cn/en_releasepaper/content/4756081
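The μ-law companding referred to here is the classic G.711-style amplitude compression: small amplitudes are expanded and large ones compressed before the network, and the inverse transform restores the scale afterwards. A standard formulation (μ = 255 is the common telephony choice, not necessarily the paper's setting):

```python
import math

def mu_law_compress(x, mu=255):
    """Mu-law companding of a sample x in [-1, 1]:
    sign(x) * ln(1 + mu*|x|) / ln(1 + mu)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=255):
    """Inverse of mu_law_compress: sign(y) * ((1 + mu)**|y| - 1) / mu."""
    return math.copysign((math.pow(1 + mu, abs(y)) - 1) / mu, y)
```

The transform is invertible, so it can be applied at the input and undone at the output without information loss.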
10. Adaptive speech enhancement based on SNR perception in non-stationary noise scenarios
Chen Zhishuai, Wang Jing
Computer Science and Technology, 14 January 2022
Abstract: Speech enhancement is a perennial topic in speech processing. In real life, interference signals are usually non-stationary or even bursty, so studying speech enhancement in non-stationary noise scenarios is of great practical significance. Existing speech enhancement algorithms face two problems: first, most current studies assume stationary noise, but real environments change rapidly and noise cannot be assumed constant; second, because non-stationary noise is hard to predict, it is difficult to avoid signal distortion after enhancement at low SNR. To address these problems, this paper proposes an adaptive speech enhancement method based on SNR perception in non-stationary noise scenarios, improving the adaptability of speech enhancement in complex non-stationary scenes. A neural network is also designed to estimate the signal-to-noise ratio and improve the processing capability of the model.
To cite this article: Chen Zhishuai, Wang Jing. Adaptive speech enhancement based on SNR perception in non-stationary noise scenarios [OL]. [14 January 2022] http://en.paper.edu.cn/en_releasepaper/content/4756093
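SNR perception presupposes a definition of SNR to estimate, and the network's training target can be computed from clean/noisy signal pairs. A minimal sketch of that ground-truth computation (names are mine; the per-frame segmentation a practical system would use is omitted):

```python
import math

def snr_db(clean, noisy):
    """Global SNR in dB of a noisy signal against its clean reference:
    10 * log10(signal power / noise power), noise = noisy - clean."""
    noise = [n - c for c, n in zip(clean, noisy)]
    ps = sum(c * c for c in clean)
    pn = sum(e * e for e in noise)
    return 10.0 * math.log10(ps / pn)
```

An SNR-adaptive enhancer in the spirit of the abstract would condition its processing on a neural estimate of this quantity, since the clean reference is unavailable at inference time.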
© 2003-2012 Sciencepaper Online, unless otherwise stated.