Package com.gmt2001
Class PatternDetector
java.lang.Object
com.gmt2001.PatternDetector
Provides pattern matchers to JS, where Java RegEx is required
- Author: gmt2001
-
Method Summary
Modifier and Type, Method, Description:
- static String getLink(String str): Returns the link contained in the input string which matches the links regex
- getLinks: Returns all links contained in the input string which matches the links regex
- static boolean hasAnyLinks(String str): Indicates if the input string matches the links regex
- static boolean hasIpLinks(String str): Indicates if the input string matches the links regex for the ip capture group
- static boolean hasProtoLinks(String str): Indicates if the input string matches the links regex for the protouri capture group
- static boolean hasWebLinks(String str): Indicates if the input string matches the links regex for the weburi capture group
- static Matcher linksMatcher(String str): Provides a Matcher for the links pattern against the input string
-
Method Details
-
linksMatcher
Provides a Matcher for the links pattern against the input string
Use the top-level capture groups to determine which type of match was made
All matches are made using Unicode Case-Insensitive rules
The links pattern checks for any of the following:
- Any text followed by a full stop (.) followed by a valid TLD, which looks like an HTTP, FTP, RTSP, or WS link
  - Unicode text is supported
  - Non-ASCII TLDs match against both the punycode and Unicode representations, such as matching either xn--vermgensberater-ctb or vermögensberater
  - The scheme and port are matched optionally for the purposes of including them in the capture groups
  - The path, query, and fragment are not matched or output in the capture groups
  - Capture Groups:
    - weburi - the whole match of scheme and authority, such as https://hello.example.com:25000
      - webscheme - the scheme, if present, such as https
      - webauthority - the entire authority, such as hello.example.com:25000
        - webdomain - the domain component, including subdomain, such as hello.example
        - the TLD, such as com
          - webtld - if spaces were not used around the dot and no known workarounds were detected, such as example.com
          - webworkaroundtld - if spaces were used around the dot, or another potential workaround to detection was used, such as example. com
            - Only some TLDs are covered by workaround detection
        - webport - the port, if present, such as 25000
- IP addresses
  - Capture Groups:
    - ip - any type of IP address
      - ipv4 - any sequence of base-10 numbers [0-9] and full stops (.) that looks like a valid IPv4 address
      - ipv6 - any sequence of base-16 numbers [0-9a-fA-F] and colons (:) that looks like an IPv6 address
        - Supports rules for removing leading 0 in each group
        - Supports ::
        - Will false-trigger on incorrect usage of colons, such as 2001:0db8::ff00::8329
- Any text that looks like a selected list of URIs used for other protocols, such as skype:, magnet:, and mailto:
  - Capture Groups:
    - protouri - the whole match of protocol URI, such as magnet:?xt=urn:btih:c12fe1c06bba254a9dc9f519b335aa7c1367a88a
      - protoscheme - the scheme, such as magnet
      - protourn - the urn or other data of the URI, such as ?xt=urn:btih:c12fe1c06bba254a9dc9f519b335aa7c1367a88a
- All capture types will also include the following capture groups:
  - path - the path, query, and fragment components, if present
- Parameters: str - the string being tested
- Returns: a Matcher that can be used to test if the string contains links
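The capture-group layout described above can be illustrated with a minimal sketch. The actual links pattern is not shown on this page, so the LINKS pattern below is a hypothetical, heavily simplified stand-in (three TLDs, IPv4 only, no workaround detection, no path group) that only reuses the documented group names; the class and method names are likewise illustrative. The flags shown are one way to get Unicode case-insensitive matching in java.util.regex and are an assumption about how the real pattern is compiled. In real use, the Matcher would come from PatternDetector.linksMatcher(str) instead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkGroupsSketch {
    // Hypothetical stand-in for the links pattern; NOT the real PatternDetector regex.
    private static final Pattern LINKS = Pattern.compile(
        "(?<weburi>(?:(?<webscheme>https?|ftp|rtsp|wss?)://)?"
      + "(?<webauthority>(?<webdomain>[\\p{L}\\p{N}-]+(?:\\.[\\p{L}\\p{N}-]+)*)"
      + "\\.(?<webtld>com|net|org)(?![\\p{L}\\p{N}-])(?::(?<webport>[0-9]{1,5}))?))"
      + "|(?<ip>(?<ipv4>(?:[0-9]{1,3}\\.){3}[0-9]{1,3}))"
      + "|(?<protouri>(?<protoscheme>skype|magnet|mailto):(?<protourn>\\S+))",
        Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

    // Uses the top-level groups to decide which type of match was made.
    static String classify(String str) {
        Matcher m = LINKS.matcher(str);
        if (!m.find()) {
            return "none";
        }
        if (m.group("weburi") != null) {
            return "web";
        }
        if (m.group("ip") != null) {
            return "ip";
        }
        return "proto";
    }

    // Returns the named group of the first match, or null if there is no
    // match or the group did not take part in it.
    static String group(String str, String name) {
        Matcher m = LINKS.matcher(str);
        return m.find() ? m.group(name) : null;
    }

    public static void main(String[] args) {
        String sample = "https://hello.example.com:25000";
        System.out.println(classify(sample));
        for (String name : new String[] {"webscheme", "webdomain", "webtld", "webport"}) {
            System.out.println(name + " = " + group(sample, name));
        }
    }
}
```

A named group that belongs to an alternative that did not match returns null from Matcher.group(String), which is what makes the top-level groups usable as a type indicator.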
-
hasAnyLinks
Indicates if the input string matches the links regex
- Parameters: str - the string being tested
- Returns: true if a link is detected
-
hasWebLinks
Indicates if the input string matches the links regex for the weburi capture group
- Parameters: str - the string being tested
- Returns: true if a link is detected in the weburi capture group
-
hasIpLinks
Indicates if the input string matches the links regex for the ip capture group
- Parameters: str - the string being tested
- Returns: true if a link is detected in the ip capture group
-
hasProtoLinks
Indicates if the input string matches the links regex for the protouri capture group
- Parameters: str - the string being tested
- Returns: true if a link is detected in the protouri capture group
-
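The four boolean checks above can be pictured as thin wrappers around a Matcher. The following is a minimal sketch under stated assumptions, not the library's implementation: the LINKS pattern is a hypothetical two-group stand-in, the helper names hasAny and hasGroup are invented for illustration, and scanning every match (rather than only the first) for the requested group is an assumption about the intended behavior.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HasLinksSketch {
    // Hypothetical stand-in pattern with two of the documented top-level groups.
    private static final Pattern LINKS = Pattern.compile(
        "(?<weburi>[a-z0-9-]+\\.(?:com|net|org)\\b)"
      + "|(?<ip>(?:[0-9]{1,3}\\.){3}[0-9]{1,3})",
        Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

    // Analogous to hasAnyLinks: true if the pattern matches anywhere.
    static boolean hasAny(String str) {
        return LINKS.matcher(str).find();
    }

    // Analogous to hasWebLinks / hasIpLinks: true if any match
    // populated the given top-level capture group.
    static boolean hasGroup(String str, String group) {
        Matcher m = LINKS.matcher(str);
        while (m.find()) {
            if (m.group(group) != null) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasAny("visit example.com"));
        System.out.println(hasGroup("visit example.com", "ip"));
    }
}
```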
getLink
Returns the link contained in the input string which matches the links regex
If multiple links are present, only the first one returned by the Matcher is returned
Matches against all link types
- Parameters: str - the string being tested
- Returns: null if no links were detected; otherwise, the first link returned by the Matcher
-
getLinks
Returns all links contained in the input string which matches the links regex
Matches against all link types
-
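The difference between getLink and getLinks comes down to stopping at the first match versus collecting every match. The sketch below shows that distinction with plain java.util.regex; it is an illustration under assumptions, not the library's code. The LINKS pattern is a hypothetical stand-in, the names firstLink and allLinks are invented, and the List<String> return type is an assumption because the return type of getLinks is not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GetLinksSketch {
    // Hypothetical stand-in pattern; the real links regex covers far more cases.
    private static final Pattern LINKS = Pattern.compile(
        "(?<weburi>[a-z0-9-]+\\.(?:com|net|org)\\b)"
      + "|(?<ip>(?:[0-9]{1,3}\\.){3}[0-9]{1,3})",
        Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

    // Analogous to getLink: the first match of any link type, or null.
    static String firstLink(String str) {
        Matcher m = LINKS.matcher(str);
        return m.find() ? m.group() : null;
    }

    // Analogous to getLinks: every match of any link type, in order.
    // The List<String> return type is an assumption.
    static List<String> allLinks(String str) {
        List<String> links = new ArrayList<>();
        Matcher m = LINKS.matcher(str);
        while (m.find()) {
            links.add(m.group());
        }
        return links;
    }

    public static void main(String[] args) {
        String sample = "a example.com b 10.0.0.1";
        System.out.println(firstLink(sample));
        System.out.println(allLinks(sample));
    }
}
```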