One Article Review

Source GoogleSec
Identifier 8418787
Publication date 2023-11-29 12:00:03 (viewed: 2023-12-03 11:07:06)
Title Improving Text Classification Resilience and Efficiency with RETVec
Text Elie Bursztein, Cybersecurity & AI Research Director, and Marina Zhang, Software Engineer

Systems such as Gmail, YouTube and Google Play rely on text classification models to identify harmful content including phishing attacks, inappropriate comments, and scams. These types of texts are harder for machine learning models to classify because bad actors rely on adversarial text manipulations to actively attempt to evade the classifiers. For example, they will use homoglyphs, invisible characters, and keyword stuffing to bypass defenses.

To help make text classifiers more robust and efficient, we've developed a novel, multilingual text vectorizer called RETVec (Resilient & Efficient Text Vectorizer) that helps models achieve state-of-the-art classification performance and drastically reduces computational cost. Today, we're sharing how RETVec has been used to help protect Gmail inboxes.

Strengthening the Gmail Spam Classifier with RETVec

Figure 1. RETVec-based Gmail Spam filter improvements.
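To make the adversarial manipulations mentioned above concrete, here is a minimal Python sketch of how homoglyphs and invisible characters can slip past an exact-match keyword filter. It is an illustration only: it is not RETVec and not how Gmail classifies mail, and the keyword list and every function name are hypothetical.

```python
import unicodedata

# Hypothetical blocklist, for illustration only.
BLOCKED_KEYWORDS = {"password", "invoice"}


def naive_flag(text: str) -> bool:
    """Flag text if any blocked keyword appears as an exact substring."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKED_KEYWORDS)


def with_homoglyphs(text: str) -> str:
    """Swap Latin 'a' and 'o' for look-alike Cyrillic letters."""
    return text.replace("a", "\u0430").replace("o", "\u043e")


def with_invisible(text: str) -> str:
    """Insert a zero-width space (U+200B) between every character."""
    return "\u200b".join(text)


def strip_invisible(text: str) -> str:
    """Drop Unicode format characters (category Cf), such as U+200B."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


message = "reset your password"
flag_plain = naive_flag(message)
flag_homoglyph = naive_flag(with_homoglyphs(message))
flag_invisible = naive_flag(with_invisible(message))
flag_cleaned = naive_flag(strip_invisible(with_invisible(message)))
flag_homoglyph_cleaned = naive_flag(strip_invisible(with_homoglyphs(message)))
```

In this sketch the plain message is flagged, while both manipulated variants evade the exact-match check even though they look the same to a reader. Stripping format characters recovers the invisible-character case but does nothing for homoglyphs, which hints at why hand-written normalization rules tend to lag behind attackers.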
Sent Yes
Tags Spam Mobile
Stories
Rating ★★


The article does not appear to have been picked up after its publication.


The article does not appear to be a repost of an earlier one.