Regular Expressions in C#

A regular expression is a pattern that can be matched against an input text. The .NET Framework provides a regular expression engine that performs such matching. In C#, a pattern consists of one or more character literals, operators, or constructs.

Several categories of characters, operators, and constructs let you define regular expressions in C#:

  1. Character escape
  2. Character class
  3. Anchor
  4. Grouping construct
  5. Quantifier
  6. Backreference construct
  7. Alternation construct
  8. Substitution
  9. Miscellaneous constructs

Constructs for defining regular expressions in C#

  1. Character escapes in C#
  2. Anchors in C#
  3. Grouping constructs in C#
  4. Character classes in C#
  5. Quantifiers in C#
  6. Backreference constructs in C#
  7. Alternation constructs in C#
  8. Substitutions in C#
  9. Miscellaneous constructs in C#
  10. Regex class in C#

Character escapes in C#

An escape is a way of writing a special character. The backslash character (\) in a regular expression indicates that the character that follows it either is a special character or should be interpreted literally.

The following table lists the character escapes in C#:

| Escape character | Description | Pattern | Matches |
|---|---|---|---|
| `\a` | Matches a bell character, `\u0007`. | `\a` | `"\u0007"` in `"Warning!" + '\u0007'` |
| `\b` | In a character class, matches a backspace, `\u0008`. | `[\b]{3,}` | `"\b\b\b\b"` in `"\b\b\b\b"` |
| `\t` | Matches a tab, `\u0009`. | `(\w+)\t` | `"Name\t"`, `"Addr\t"` in `"Name\tAddr\t"` |
| `\r` | Matches a carriage return, `\u000D`. (`\r` is not equivalent to the newline character, `\n`.) | `\r\n(\w+)` | `"\r\nHello"` in `"\r\nHello\nWorld."` |
| `\v` | Matches a vertical tab, `\u000B`. | `[\v]{2,}` | `"\v\v\v"` in `"\v\v\v"` |
| `\f` | Matches a form feed, `\u000C`. | `[\f]{2,}` | `"\f\f\f"` in `"\f\f\f"` |
| `\n` | Matches a newline, `\u000A`. | `\r\n(\w+)` | `"\r\nHello"` in `"\r\nHello\nWorld."` |
| `\e` | Matches an escape, `\u001B`. | `\e` | `"\x001B"` in `"\x001B"` |
| `\nnn` | Uses octal (base 8) representation to specify a character (nnn consists of two or three digits). | `\w\040\w` | `"a b"`, `"c d"` in `"a bc d"` |
| `\xnn` | Uses hexadecimal (base 16) representation to specify a character (nn consists of exactly two digits). | `\w\x20\w` | `"a b"`, `"c d"` in `"a bc d"` |
| `\cX`, `\cx` | Matches the ASCII control character specified by X or x, where X or x is the letter of the control character. | `\cC` | `"\x0003"` in `"\x0003"` (Ctrl-C) |
| `\unnnn` | Matches a Unicode character by using hexadecimal representation (exactly four digits, as represented by nnnn). | `\w\u0020\w` | `"a b"`, `"c d"` in `"a bc d"` |
| `\` | When followed by a character that is not recognized as an escape character, matches that character. | `\d+[\+-x\*]\d+` | `"2+2"` and `"3*9"` in `"(2+2) * 3*9"` |
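A few of the escapes from the table can be tried out with a short program. This is only a minimal sketch: the class name `EscapeSketch` and the helper `Find` are made up for illustration, while `Regex.Matches` and `Regex.Escape` are the standard .NET calls.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class EscapeSketch
{
    // Hypothetical helper: returns every match of pattern in input as a string.
    public static string[] Find(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    static void Main()
    {
        // \t in the pattern matches a tab. The verbatim string (@"...") keeps
        // the backslash intact so the regex engine sees it.
        foreach (string s in Find("Name\tAddr\t", @"(\w+)\t"))
            Console.WriteLine(s.Replace("\t", "<TAB>"));

        // \x20 is the hexadecimal escape for a space character.
        Console.WriteLine(string.Join(" | ", Find("a bc d", @"\w\x20\w")));

        // Regex.Escape adds backslashes so that metacharacters match literally.
        Console.WriteLine(Regex.Escape("2+2"));
    }
}
```

Writing patterns as verbatim strings avoids having to double every backslash, which is the most common source of broken escapes in C# patterns.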

Anchors in C#

Anchors cause a match to succeed or fail depending on the current position in the string. The following table lists the anchors in C#:

| Assertion | Description | Pattern | Matches |
|---|---|---|---|
| `^` | The match must start at the beginning of the string or line. | `^\d{3}` | `"567"` in `"567-777-"` |
| `$` | The match must occur at the end of the string or before `\n` at the end of the line or string. | `-\d{4}$` | `"-2012"` in `"8-12-2012"` |
| `\A` | The match must occur at the start of the string. | `\A\w{4}` | `"Code"` in `"Code-007-"` |
| `\Z` | The match must occur at the end of the string or before `\n` at the end of the string. | `-\d{3}\Z` | `"-007"` in `"Bond-901-007"` |
| `\z` | The match must occur at the end of the string. | `-\d{3}\z` | `"-333"` in `"-901-333"` |
| `\G` | The match must occur at the point where the previous match ended. | `\G\(\d\)` | `"(1)"`, `"(3)"`, `"(5)"` in `"(1)(3)(5)[7](9)"` |
| `\b` | The match must occur on a boundary between a `\w` (alphanumeric) and a `\W` (non-alphanumeric) character. | `\b\w+\b` | `"Room"`, `"1"` in `"Room#1"` |
| `\B` | The match must not occur on a `\b` boundary. | `\Bend\w*\b` | `"ends"`, `"ender"` in `"end sends endure lender"` |
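The effect of an anchor is easiest to see by running the same kind of pattern with and without it. The sketch below assumes a made-up helper `Find` in a made-up class `AnchorSketch`; `RegexOptions.Multiline` is the standard option that lets `^` and `$` match at line breaks.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class AnchorSketch
{
    // Hypothetical helper: returns every match of pattern in input.
    public static string[] Find(string input, string pattern,
                                RegexOptions options = RegexOptions.None) =>
        Regex.Matches(input, pattern, options).Cast<Match>().Select(m => m.Value).ToArray();

    static void Main()
    {
        // ^ anchors to the start of the string: only the leading digits match.
        Console.WriteLine(string.Join(", ", Find("567-777-", @"^\d{3}")));

        // By default ^ matches only at the start of the whole string;
        // with RegexOptions.Multiline it also matches after each newline.
        Console.WriteLine(Find("one\ntwo", @"^\w+").Length);
        Console.WriteLine(Find("one\ntwo", @"^\w+", RegexOptions.Multiline).Length);

        // \G requires each match to begin exactly where the previous one ended,
        // so matching stops at "[7]".
        Console.WriteLine(string.Join(", ", Find("(1)(3)(5)[7](9)", @"\G\(\d\)")));

        // \B requires a non-boundary position, so "end" must start inside a word.
        Console.WriteLine(string.Join(", ", Find("end sends endure lender", @"\Bend\w*\b")));
    }
}
```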

Grouping constructs in C#

Grouping constructs delineate the subexpressions of a regular expression and capture substrings of an input string. The following table lists the grouping constructs in C#:

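| Grouping construct | Description | Pattern | Matches |
|---|---|---|---|
| `(subexpression)` | Captures the matched subexpression and assigns it an ordinal number, starting at 1 (group 0 is the entire match). | `(\w)\1` | `"ee"` in `"deep"` |
| `(?<name>subexpression)` | Captures the matched subexpression into a named group. | `(?<double>\w)\k<double>` | `"ee"` in `"deep"` |
| `(?<name1-name2>subexpression)` | Defines a balancing group definition. | `(((?'Open'\()[^\(\)]*)+((?'Close-Open'\))[^\(\)]*)+)*(?(Open)(?!))$` | `"((1-3)*(3-1))"` in `"3+2^((1-3)*(3-1))"` |
| `(?:subexpression)` | Defines a noncapturing group. | `Write(?:Line)?` | `"WriteLine"` in `"Console.WriteLine()"` |
| `(?imnsx-imnsx:subexpression)` | Applies or disables the specified options within the subexpression. | `A\d{2}(?i:\w+)\b` | `"A12xl"`, `"A12XL"` in `"A12xl A12XL a12xl"` |
| `(?=subexpression)` | Zero-width positive lookahead assertion. | `\w+(?=\.)` | `"is"`, `"ran"`, and `"out"` in `"He is. The dog ran. The sun is out."` |
| `(?!subexpression)` | Zero-width negative lookahead assertion. | `\b(?!un)\w+\b` | `"sure"`, `"used"` in `"unsure sure unity used"` |
| `(?<=subexpression)` | Zero-width positive lookbehind assertion. | `(?<=19)\d{2}\b` | `"99"`, `"50"`, `"05"` in `"1851 1999 1950 1905 2003"` |
| `(?<!subexpression)` | Zero-width negative lookbehind assertion. | `(?<!19)\d{2}\b` | `"51"`, `"03"` in `"1851 1999 1950 1905 2003"` |
| `(?>subexpression)` | Nonbacktracking (atomic) subexpression. | `[13579](?>A+B+)` | `"1ABB"`, `"3ABB"`, and `"5AB"` in `"1ABB 3ABBC 5AB 5AC"` |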
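As a rough illustration of how captured groups and lookbehind assertions behave at run time, here is a small sketch. The class name `GroupSketch`, the helper `Find`, and the group name `double` are illustrative choices only; `Match.Groups` is the standard way to read captures.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class GroupSketch
{
    // Hypothetical helper: returns every match of pattern in input.
    public static string[] Find(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    static void Main()
    {
        // A named group is read back through Match.Groups by its name.
        Match m = Regex.Match("deep", @"(?<double>\w)\k<double>");
        Console.WriteLine(m.Value + " / " + m.Groups["double"].Value);

        // (?:...) groups without capturing, so only group 0 (the whole match) exists.
        Match w = Regex.Match("Console.WriteLine()", @"Write(?:Line)?");
        Console.WriteLine(w.Value + " / groups: " + w.Groups.Count);

        // Lookbehind inspects the text before the match without consuming it.
        const string years = "1851 1999 1950 1905 2003";
        Console.WriteLine(string.Join(", ", Find(years, @"(?<=19)\d{2}\b")));
        Console.WriteLine(string.Join(", ", Find(years, @"(?<!19)\d{2}\b")));
    }
}
```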

Character classes in C#

A character class matches any one of a set of characters. The following table lists the character classes in C#:

| Character class | Description | Pattern | Matches |
|---|---|---|---|
| `[character_group]` | Matches any single character in character_group. By default, the match is case-sensitive. | `[mn]` | `"m"` in `"mat"`; `"m"`, `"n"` in `"moon"` |
| `[^character_group]` | Negation: matches any single character that is not in character_group. By default, characters in character_group are case-sensitive. | `[^aei]` | `"v"`, `"l"` in `"avail"` |
| `[first-last]` | Character range: matches any single character in the range from first to last. | `[B-D]irds` | `"Birds"`, `"Cirds"`, `"Dirds"` in `"Birds Cirds Dirds"` |
| `.` | Wildcard: matches any single character except `\n`. | `a.e` | `"ave"` in `"have"`; `"ate"` in `"mate"` |
| `\p{name}` | Matches any single character in the Unicode general category or named block specified by name. | `\p{Lu}` | `"C"`, `"L"` in `"City Lights"` |
| `\P{name}` | Matches any single character that is not in the Unicode general category or named block specified by name. | `\P{Lu}` | `"i"`, `"t"`, `"y"` in `"City"` |
| `\w` | Matches any word character. | `\w` | `"R"`, `"o"`, `"o"`, `"m"`, and `"1"` in `"Room#1"` |
| `\W` | Matches any non-word character. | `\W` | `"#"` in `"Room#1"` |
| `\s` | Matches any whitespace character. | `\w\s` | `"D "` in `"ID A1.3"` |
| `\S` | Matches any non-whitespace character. | `\s\S` | `" _"` in `"int __ctr"` |
| `\d` | Matches any decimal digit. | `\d` | `"4"` in `"4 = IV"` |
| `\D` | Matches any character other than a decimal digit. | `\D` | `" "`, `"="`, `" "`, `"I"`, `"V"` in `"4 = IV"` |
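The character classes above can be checked one at a time by listing every match in a sample string. The following is a minimal sketch; `ClassSketch` and `Find` are made-up names used only for this illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class ClassSketch
{
    // Hypothetical helper: returns every match of pattern in input.
    public static string[] Find(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    // Prints each match in square brackets so that spaces stay visible.
    static void Show(string input, string pattern) =>
        Console.WriteLine(pattern + ": " +
            string.Join(" ", Find(input, pattern).Select(s => "[" + s + "]")));

    static void Main()
    {
        Show("avail", @"[^aei]");      // negated group
        Show("have mate", @"a.e");     // wildcard
        Show("City Lights", @"\p{Lu}"); // Unicode category: uppercase letters
        Show("4 = IV", @"\D");         // everything except decimal digits
    }
}
```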

Quantifiers in C#

A quantifier specifies how many instances of the previous element (which can be a character, a group, or a character class) must be present in the input string for a match to occur.

| Quantifier | Description | Pattern | Matches |
|---|---|---|---|
| `*` | Matches the previous element zero or more times. | `\d*\.\d` | `".0"`, `"19.9"`, `"219.9"` |
| `+` | Matches the previous element one or more times. | `be+` | `"bee"` in `"been"`, `"be"` in `"bent"` |
| `?` | Matches the previous element zero or one time. | `rai?n` | `"ran"`, `"rain"` |
| `{n}` | Matches the previous element exactly n times. | `,\d{3}` | `",043"` in `"1,043.6"`; `",876"`, `",543"`, and `",210"` in `"9,876,543,210"` |
| `{n,}` | Matches the previous element at least n times. | `\d{2,}` | `"166"`, `"29"`, `"1930"` |
| `{n,m}` | Matches the previous element at least n times, but no more than m times. | `\d{3,5}` | `"166"`, `"17668"`; `"19302"` in `"193024"` |
| `*?` | Matches the previous element zero or more times, but as few times as possible. | `\d*?\.\d` | `".0"`, `"19.9"`, `"219.9"` |
| `+?` | Matches the previous element one or more times, but as few times as possible. | `be+?` | `"be"` in `"been"`, `"be"` in `"bent"` |
| `??` | Matches the previous element zero or one time, but as few times as possible. | `rai??n` | `"ran"`, `"rain"` |
| `{n}?` | Matches the previous element exactly n times. | `,\d{3}?` | `",043"` in `"1,043.6"`; `",876"`, `",543"`, and `",210"` in `"9,876,543,210"` |
| `{n,}?` | Matches the previous element at least n times, but as few times as possible. | `\d{2,}?` | `"166"`, `"29"`, `"1930"` |
| `{n,m}?` | Matches the previous element between n and m times, but as few times as possible. | `\d{3,5}?` | `"166"`, `"17668"`; `"193"`, `"024"` in `"193024"` |
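The difference between a greedy quantifier and its lazy counterpart (the same quantifier followed by `?`) shows up when both are run on the same input. This is a small sketch under the assumption of a made-up helper `Find` in a made-up class `QuantifierSketch`.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class QuantifierSketch
{
    // Hypothetical helper: returns every match of pattern in input.
    public static string[] Find(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    static void Main()
    {
        // Greedy + takes as many 'e' characters as it can; lazy +? takes one.
        Console.WriteLine(string.Join(", ", Find("been", @"be+")));
        Console.WriteLine(string.Join(", ", Find("been", @"be+?")));

        // Greedy {3,5} consumes five digits in one match; lazy {3,5}? stops
        // at three, which leaves enough digits for a second match.
        Console.WriteLine(string.Join(", ", Find("193024", @"\d{3,5}")));
        Console.WriteLine(string.Join(", ", Find("193024", @"\d{3,5}?")));

        // {n} has no lazy/greedy distinction: it always matches exactly n times.
        Console.WriteLine(string.Join(" ", Find("9,876,543,210", @",\d{3}")));
    }
}
```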

Backreference constructs in C#

A backreference construct allows a previously matched subexpression to be identified again later in the same regular expression.

The following table lists the backreference constructs in C#:

| Backreference construct | Description | Pattern | Matches |
|---|---|---|---|
| `\number` | Numbered backreference. Matches the value of a numbered subexpression. | `(\w)\1` | `"ee"` in `"seek"` |
| `\k<name>` | Named backreference. Matches the value of a named expression. | `(?<char>\w)\k<char>` | `"ee"` in `"seek"` |
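A common use of backreferences is spotting a word that was typed twice in a row. The sketch below is one possible way to do that; the class name `BackrefSketch`, the helper `Find`, and the sample sentence are invented for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class BackrefSketch
{
    // Hypothetical helper: returns every match of pattern in input.
    public static string[] Find(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    static void Main()
    {
        // \1 must match the same text that group 1 captured: a doubled letter.
        Console.WriteLine(string.Join(", ", Find("seek", @"(\w)\1")));

        // The same idea with a named group and \k<name>.
        Console.WriteLine(string.Join(", ", Find("seek", @"(?<char>\w)\k<char>")));

        // A whole word followed by whitespace and the same word again.
        Console.WriteLine(string.Join(", ",
            Find("this is is a test test", @"\b(\w+)\s+\1\b")));
    }
}
```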

Alternation constructs in C#

Alternation constructs modify a regular expression to enable either/or matching. The following list describes the alternation constructs in C#:

  1. |
     Description: Matches any one element separated by the vertical bar (|) character.
     Pattern: th(e|is|at)
     Matches: "the", "this" in "this is the day."

  2. (?(expression)yes|no)
     Description: Matches yes if expression matches; otherwise, matches the optional no part. Expression is interpreted as a zero-width assertion.
     Pattern: (?(A)A\d{2}\b|\b\d{3}\b)
     Matches: "A10", "910" in "A10 C103 910"

  3. (?(name)yes|no)
     Description: Matches yes if the named capture name has a match; otherwise, matches the optional no part.
     Pattern: (?<quoted>")?(?(quoted).+?"|\S+\s)
     Matches: Dogs.jpg, "Yiska playing.jpg" in "Dogs.jpg "Yiska playing.jpg""
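The first two alternation forms can be tried with a few lines of code. This is a sketch only, with a made-up class `AlternationSketch` and helper `Find`.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class AlternationSketch
{
    // Hypothetical helper: returns every match of pattern in input.
    public static string[] Find(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    static void Main()
    {
        // Plain alternation: "th" followed by one of the three alternatives.
        // Matches are reported in the order they occur in the input.
        Console.WriteLine(string.Join(", ", Find("this is the day.", @"th(e|is|at)")));

        // Conditional alternation: if the next character is 'A', use the first
        // branch (A plus two digits); otherwise use the second (three digits).
        Console.WriteLine(string.Join(", ",
            Find("A10 C103 910", @"(?(A)A\d{2}\b|\b\d{3}\b)")));
    }
}
```

"C103" is skipped because it neither starts with 'A' nor consists of exactly three digits between word boundaries.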

Substitutions in C#

Substitutions are language elements used in replacement patterns. The following table lists the substitutions in C#:

| Character | Description | Pattern | Replacement pattern | Input string | Result string |
|---|---|---|---|---|---|
| `$number` | Substitutes the substring matched by group number. | `\b(\w+)(\s)(\w+)\b` | `$3$2$1` | `"one two"` | `"two one"` |
| `${name}` | Substitutes the substring matched by the named group name. | `\b(?<word1>\w+)(\s)(?<word2>\w+)\b` | `${word2} ${word1}` | `"one two"` | `"two one"` |
| `$$` | Substitutes a literal "$". | `\b(\d+)\s?USD` | `$$$1` | `"103 USD"` | `"$103"` |
| `$&` | Substitutes a copy of the whole match. | `(\$*(\d*(\.+\d+)?){1})` | `**$&` | `"$1.30"` | `"**$1.30**"` |
| `` $` `` | Substitutes all the text of the input string before the match. | `B+` | `` $` `` | `"AABBCC"` | `"AAAACC"` |
| `$'` | Substitutes all the text of the input string after the match. | `B+` | `$'` | `"AABBCC"` | `"AACCCC"` |
| `$+` | Substitutes the last group that was captured. | `B+(C+)` | `$+` | `"AABBCCDD"` | `"AACCDD"` |
| `$_` | Substitutes the entire input string. | `B+` | `$_` | `"AABBCC"` | `"AAAABBCCCC"` |
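Substitutions only have meaning inside the replacement argument of `Regex.Replace`. The sketch below wraps a few of the table's rows in small methods; the class name `SubstitutionSketch`, the method names, and the group names `word1` and `word2` are illustrative assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

static class SubstitutionSketch
{
    // Swaps two words by referring to the captured groups by number.
    public static string SwapByNumber(string input) =>
        Regex.Replace(input, @"\b(\w+)(\s)(\w+)\b", "$3$2$1");

    // The same swap using named groups and the ${name} substitution.
    public static string SwapByName(string input) =>
        Regex.Replace(input, @"\b(?<word1>\w+)(\s)(?<word2>\w+)\b", "${word2} ${word1}");

    // $$ emits a literal dollar sign; $1 then inserts the captured amount.
    public static string ToDollars(string input) =>
        Regex.Replace(input, @"\b(\d+)\s?USD", "$$$1");

    static void Main()
    {
        Console.WriteLine(SwapByNumber("one two"));
        Console.WriteLine(SwapByName("one two"));
        Console.WriteLine(ToDollars("103 USD"));

        // $` inserts the text that precedes the match ("AA" here).
        Console.WriteLine(Regex.Replace("AABBCC", "B+", "$`"));
    }
}
```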

Miscellaneous constructs in C#

The following table lists the miscellaneous constructs in C#:

| Construct | Definition | Example |
|---|---|---|
| `(?imnsx-imnsx)` | Sets or disables options, such as case insensitivity, in the middle of a pattern. | `\bA(?i)b\w+\b` matches `"ABA"`, `"Able"` in `"ABA Able Act"` |
| `(?#comment)` | Inline comment. The comment ends at the first closing parenthesis. | `\bA(?#Matches words starting with A)\w+\b` |
| `#` [to end of line] | X-mode comment. The comment starts at an unescaped `#` and continues to the end of the line. | `(?x)\bA\w+\b#Matches words starting with A` |
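Inline options and x-mode comments can be combined to keep a longer pattern readable. The following is a minimal sketch; `MiscSketch`, `Find`, and the `Commented` pattern are made up for illustration, while `RegexOptions.IgnorePatternWhitespace` is the standard option that corresponds to `(?x)`.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class MiscSketch
{
    // With IgnorePatternWhitespace, unescaped white space in the pattern is
    // ignored and everything from # to the end of a line is a comment.
    public const string Commented = @"
        \b A     # word boundary, then a capital A
        \w+ \b   # the rest of the word
    ";

    // Hypothetical helper: returns every match of pattern in input.
    public static string[] Find(string input, string pattern,
                                RegexOptions options = RegexOptions.None) =>
        Regex.Matches(input, pattern, options).Cast<Match>().Select(m => m.Value).ToArray();

    static void Main()
    {
        // (?i) switches to case-insensitive matching for the rest of the pattern,
        // so the 'b' after it matches both "B" and "b".
        Console.WriteLine(string.Join(", ", Find("ABA Able Act", @"\bA(?i)b\w+\b")));

        // The commented pattern is equivalent to \bA\w+\b.
        Console.WriteLine(string.Join(", ",
            Find("ABA Able Act", Commented, RegexOptions.IgnorePatternWhitespace)));
    }
}
```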

Regex class in C#

The Regex class in C# represents a regular expression. It has the following commonly used methods:

1. public bool IsMatch(string input)

Indicates whether the regular expression specified in the Regex constructor finds a match in the specified input string.

2. public bool IsMatch(string input, int startat)

Indicates whether the regular expression specified in the Regex constructor finds a match in the specified input string, beginning at the specified starting position in the string.

3. public static bool IsMatch(string input, string pattern)

Indicates whether the specified regular expression finds a match in the specified input string.

4. public MatchCollection Matches(string input)

Searches the specified input string for all occurrences of the regular expression.

5. public string Replace(string input, string replacement)

In the specified input string, replaces all strings that match the regular expression pattern with the specified replacement string.

6. public string[] Split(string input)

Splits the input string into an array of substrings at the positions defined by the regular expression pattern specified in the Regex constructor.

For the complete list of methods and properties, see the Microsoft documentation for the Regex class.
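The six methods listed above can be exercised together on one input. This is a short sketch rather than a complete tour of the class; the class name `RegexClassSketch`, the field `Digits`, and the sample input are assumptions made for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexClassSketch
{
    // One Regex instance can be reused by all of the instance methods.
    public static readonly Regex Digits = new Regex(@"\d+");

    static void Main()
    {
        const string input = "a1b22c333d";

        // Instance IsMatch: is there a run of digits anywhere in the input?
        Console.WriteLine(Digits.IsMatch(input));

        // IsMatch with startat: the search begins at the end of the string,
        // so nothing is left to match.
        Console.WriteLine(Digits.IsMatch(input, input.Length));

        // Static IsMatch: the pattern is passed alongside the input.
        Console.WriteLine(Regex.IsMatch("abc", @"\d+"));

        // Matches returns every occurrence.
        foreach (Match m in Digits.Matches(input))
            Console.WriteLine(m.Value);

        // Replace substitutes each occurrence; Split cuts the input at each one.
        Console.WriteLine(Digits.Replace(input, "#"));
        Console.WriteLine(string.Join(",", Digits.Split(input)));
    }
}
```

Creating the Regex once and reusing it avoids parsing the pattern on every call, which matters when the same pattern is applied to many inputs.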

Example 1: Match words beginning with S

    using System;
    using System.Text.RegularExpressions;

    namespace QTMCSharp
    {
        class Program
        {
            // Prints the pattern, then every match found in the text.
            private static void showMatch(string text, string expr)
            {
                Console.WriteLine("Expression: " + expr);
                MatchCollection mc = Regex.Matches(text, expr);
                foreach (Match m in mc)
                {
                    Console.WriteLine(m);
                }
            }

            static void Main(string[] args)
            {
                // Vietnamese sample text; "Sao" and "Sáng" are the words starting with 'S'.
                string str = "Sao hôm nay Sáng quá!";
                Console.WriteLine("Match words starting with 'S': ");
                showMatch(str, @"\bS\S*");
                Console.ReadKey();
            }
        }
    }

Compiling and running the above C# program produces the following result:

    Match words starting with 'S': 
    Expression: \bS\S*
    Sao
    Sáng

Example 2: Match words starting with c and ending with m

    using System;
    using System.Text.RegularExpressions;

    namespace QTMCSharp
    {
        class Program
        {
            // Prints the pattern, then every match found in the text.
            private static void showMatch(string text, string expr)
            {
                Console.WriteLine("Expression: " + expr);
                MatchCollection a = Regex.Matches(text, expr);
                foreach (Match b in a)
                {
                    Console.WriteLine(b);
                }
            }

            static void Main(string[] args)
            {
                // Vietnamese sample text; "chấm" and "com" start with 'c' and end with 'm'.
                string str = "Quản trị mạng chấm com";
                Console.WriteLine("Match words starting with 'c' and ending with 'm':");
                showMatch(str, @"\bc\S*m\b");
                Console.ReadKey();
            }
        }
    }

Compiling and running the above C# program produces the following result:

    Match words starting with 'c' and ending with 'm':
    Expression: \bc\S*m\b
    chấm
    com

Example 3: Replace extra white space

    using System;
    using System.Text.RegularExpressions;

    namespace RegExApplication
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Vietnamese sample text ("QTM greets you!") with runs of spaces between words.
                string input = "QTM   chào   bạn!";
                string pattern = "\\s+";   // one or more whitespace characters
                string replacement = " ";  // collapse each run to a single space
                Regex rgx = new Regex(pattern);
                string result = rgx.Replace(input, replacement);

                Console.WriteLine("Original string: {0}", input);
                Console.WriteLine("String with white space replaced: {0}", result);
                Console.ReadKey();
            }
        }
    }

Compiling and running the above C# program produces the following result:

    Original string: QTM   chào   bạn!
    String with white space replaced: QTM chào bạn!

According to Tutorialspoint
