Analyzing Catastrophic Backtracking Behavior in Practical Regular Expression Matching [PDF]
We develop a formal perspective on how regular expression matching works in Java, a popular representative of the category of regex-directed matching engines.
Martin Berglund+2 more
doaj +6 more sources
Text Indexing for Regular Expression Matching
Finding substrings of a text T that match a regular expression p is a fundamental problem. Despite being the subject of extensive research, no solution with a time complexity significantly better than O(|T||p|) has been found.
Daniel Gibney, Sharma V. Thankachan
doaj +2 more sources
Regular Expression Matching and Operational Semantics [PDF]
Many programming languages and tools, ranging from grep to the Java String library, contain regular expression matchers. Rather than first translating a regular expression into a deterministic finite automaton, such implementations typically match the ...
Asiri Rathnayake, Hayo Thielecke
doaj +4 more sources
Sketch-Driven Regular Expression Generation from Natural Language and Examples [PDF]
Recent systems for converting natural language descriptions into regular expressions (regexes) have achieved some success, but typically deal with short, formulaic text and can only produce simple regexes. Real-world regexes are complex, hard to describe
Xi Ye+4 more
doaj +2 more sources
Which Regular Expression Patterns are Hard to Match? [PDF]
Regular expressions constitute a fundamental notion in formal language theory and are frequently used in computer science to define search patterns. A classic algorithm for these problems constructs and simulates a non-deterministic finite automaton ...
Arturs Backurs, Piotr Indyk
core +2 more sources
Exploring efficient grouping algorithms in regular expression matching. [PDF]
BACKGROUND: Regular expression matching (REM) is widely employed as the major tool for deep packet inspection (DPI) applications. For automatic processing, the regular expression patterns need to be converted to a deterministic finite automaton (DFA ...
Chengcheng Xu, Jinshu Su, Shuhui Chen
doaj +2 more sources
Regular Expression Based Medical Text Classification Using Constructive Heuristic Approach
Medical text classification assigns medical related text into different categories such as topics or disease types. Machine learning based techniques have been widely used to perform such tasks despite the obvious drawback in such “black box ...
Menglin Cui+5 more
doaj +2 more sources
Data Extraction via Semantic Regular Expression Synthesis [PDF]
Many data extraction tasks of practical relevance require not only syntactic pattern matching but also semantic reasoning about the content of the underlying text.
Qiaochu Chen+4 more
semanticscholar +1 more source
Effective Filter for Common Injection Attacks in Online Web Applications
Injection attacks against web applications are still frequent, and organizations like OWASP place them within the Top Ten of security risks to web applications. The main goal of this work is to contribute to the community with the design of an effective
Santiago Ibarra-Fiallos+5 more
doaj +1 more source
HEDEA: A Python Tool for Extracting and Analysing Semi-structured Information from Medical Records [PDF]
Objectives: One of the most important functions for a medical practitioner while treating a patient is to study the patient's complete medical history by going through all records, from test results to doctor's notes.
Anshul Aggarwal+2 more
doaj +1 more source