One Article Review

Source AlienVault Blog
Identifier 1502911
Publication date 2020-01-21 14:00:00 (seen: 2020-01-21 15:02:48)
Title FUD-free analysis: Natural language processing (NLP)
Text If you follow me on Medium or Twitter, you may already be aware of this. If you don’t (I assure you that you’re missing out), I have been researching several technologies in preparation for an OPSEC/Anti-OSINT tool that I am crafting. I am using this tool as a means to push myself to learn something new that I can apply professionally. I am also doing this to make a positive difference in the world. Specifically, I am trying to learn Machine Learning (ML) and Natural Language Processing (NLP) in Python and R.

When we hear terms like Advanced Persistent, Next-Generation, Artificial Intelligence (AI), Machine Learning (ML), Single Pane of Glass, etc. from a vendor, we typically think it’s hype or FUD. Often, we are correct. Talking about vendor FUD phrases is ironic, because my blog and podcast were called Advanced Persistent Security. I set off on this journey to learn about machine learning in order to build a tool, but also to understand the technologies. I like to stump salespeople from time to time. Also, if these are the wave of the future, there is no time like the present to get acquainted.

So, NLP. What is it? In social engineering circles, it is Neuro-Linguistic Programming. Some (many, if not most) in the scientific community consider it pseudoscience. Regardless, it claims to be able to influence or manipulate people through non-verbal cues from the eyes, touching someone (cringe), or other means. That is not the NLP that I am working on learning.

Natural Language Processing, the more scientific NLP, is a marriage of various disciplines: computer science, data science (including AI and ML), and linguistics. NLP allows libraries and code to read language as it is written or spoken by humans (naturally, hence the name). When exposed to slang, pidgins, and dialects, it can “learn” to recognize and respond to them. Also adjacent to NLP is OCR, or Optical Character Recognition.
OCR is the means to read data from a document in a non-text format (e.g., pictures, PDF, or Word documents). Having the ability to read the data allows you to open a PDF with a script (perhaps written in Python), read it, make sense of it, and act as scripted.

Why is this important to InfoSec, and what do we do with it? We could use this in log analysis, network monitoring, analyzing phishing emails, and my personal favorite, OSINT, to name a few.

Within log analysis, NLP could be applied to gain further intelligence from logs without writing ridiculously long regular expressions (REGEX), by “learning” the context of the data and what is being sought. This would likely be in parallel with some Machine Learning, but it is a start. From the ML perspective, it would probably need to utilize supervised or semi-supervised learning with online input, rather than unsupervised or reinforcement learning. “Online” means that it would read the data closer to real time instead of ingesting a predefined dataset. The supervision of learning refers to telling the “machine” whether it was correct or not. In some instances of learning from logs, unsupervised learning could be useful in determining indicators of compromise or adversarial TTPs based on log data in two sets: breached (event data) and non-breached data. Reinforcement learning would be more applicable for tuning and improvement.

Back to NLP, the same concepts apply in network monitoring as in log analysis, except that network traffic and PCAPs would be analyzed. PCAP analysis with NLP and ML may be better suited for analyzing a user’s beha
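The supervised, online learning idea described for log analysis can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the tool the article describes: it uses a simple perceptron over word tokens (a technique picked here for brevity and not named in the article), and the sample log lines, their labels, and the names `tokenize` and `OnlineLogClassifier` are all hypothetical.

```python
import re


def tokenize(line):
    """Split a log line into lowercase word tokens.

    One generic pattern is used for every line, instead of one
    hand-written regular expression per log format.
    """
    return re.findall(r"[a-z]+", line.lower())


class OnlineLogClassifier:
    """Hypothetical perceptron that learns from one log line at a time."""

    def __init__(self):
        self.weights = {}  # token -> weight; unseen tokens count as 0.0
        self.bias = 0.0

    def score(self, line):
        return self.bias + sum(self.weights.get(t, 0.0) for t in tokenize(line))

    def predict(self, line):
        """Return True if the line looks suspicious."""
        return self.score(line) > 0

    def learn(self, line, is_suspicious):
        """Supervised step: predict, get told the answer, update on mistakes."""
        was_correct = self.predict(line) == is_suspicious
        if not was_correct:
            step = 1.0 if is_suspicious else -1.0
            for token in tokenize(line):
                self.weights[token] = self.weights.get(token, 0.0) + step
            self.bias += step
        return was_correct


# Hypothetical labeled stream: lines arrive one by one ("online"), and an
# analyst's verdict (True = suspicious) provides the supervision.
labeled_stream = [
    ("Accepted password for alice from 10.0.0.5", False),
    ("Failed password for root from 203.0.113.7", True),
    ("Accepted publickey for bob from 10.0.0.8", False),
    ("Failed password for invalid user admin from 203.0.113.7", True),
]

clf = OnlineLogClassifier()
for line, label in labeled_stream:
    clf.learn(line, label)
```

After this single pass, `clf.predict("Failed password for carol from 198.51.100.2")` can be called on a line the model has never seen. A real deployment would need far more data and a stronger model; the point of the sketch is only the loop of reading a line, predicting, being corrected, and updating.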
Sent Yes
Tags Tool Threat


The article does not appear to have been picked up after its publication.


The article does not appear to be a follow-up to an earlier one.