Regular Expression in C#

A regular expression is a pattern that can be matched against an input text. The .NET Framework provides a regular expression engine that performs such matching. In C#, a pattern consists of one or more character literals, operators, or constructs.

Several categories of characters, operators, and constructs are available for defining regular expressions in C#:

  1. Character escape
  2. Character class
  3. Anchor
  4. Grouping construct
  5. Quantifier
  6. Backreference construct
  7. Alternation construct
  8. Substitution
  9. Miscellaneous constructs

Constructs used to define regular expressions in C#

  1. Character escape in C#
  2. Anchor in C#
  3. Grouping construct in C#
  4. Character class in C#
  5. Quantifier in C#
  6. Backreference construct in C#
  7. Alternation construct in C#
  8. Substitution in C#
  9. Miscellaneous constructs in C#
  10. Regex class in C#

Character escape in C#

Character escapes denote special characters. The backslash character (\) in a regular expression indicates that the character that follows it either is a special character or should be interpreted literally.

The character escapes in C# are listed below:

Escape character | Description | Pattern | Matches
\a | Matches a bell character, \u0007. | \a | "\u0007" in "Warning!" + '\u0007'
\b | In a character class, matches a backspace, \u0008. | [\b]{3,} | "\b\b\b\b" in "\b\b\b\b"
\t | Matches a tab, \u0009. | (\w+)\t | "Name\t", "Addr\t" in "Name\tAddr\t"
\r | Matches a carriage return, \u000D. (\r is not equivalent to the newline character, \n.) | \r\n(\w+) | "\r\nHello" in "\r\nHello\nWorld."
\v | Matches a vertical tab, \u000B. | [\v]{2,} | "\v\v\v" in "\v\v\v"
\f | Matches a form feed, \u000C. | [\f]{2,} | "\f\f\f" in "\f\f\f"
\n | Matches a new line, \u000A. | \r\n(\w+) | "\r\nHello" in "\r\nHello\nWorld."
\e | Matches an escape, \u001B. | \e | "\x001B" in "\x001B"
\nnn | Uses octal representation to specify a character (nnn consists of two or three digits). | \w\040\w | "a b", "c d" in "a bc d"
\xnn | Uses hexadecimal representation to specify a character (nn consists of exactly two digits). | \w\x20\w | "a b", "c d" in "a bc d"
\cX, \cx | Matches the ASCII control character specified by X or x, where X or x is the letter of the control character. | \cC | "\x0003" in "\x0003" (Ctrl-C)
\unnnn | Matches a Unicode character by using hexadecimal representation (exactly four digits, represented by nnnn). | \w\u0020\w | "a b", "c d" in "a bc d"
\ | When followed by a character that is not recognized as an escaped character, matches that character. | \d+[\+-x\*]\d+ | "2+2" and "3*9" in "(2+2) * 3*9"
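To make the escapes above concrete, here is a minimal sketch. It assumes patterns are written as C# verbatim strings (@"...") so that each backslash reaches the regex engine unchanged. The class name EscapeDemo, the helper WordsBeforeTab, and the sample inputs are invented for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeDemo
{
    // Returns every word that is immediately followed by a tab (\t escape).
    public static string[] WordsBeforeTab(string input)
    {
        MatchCollection matches = Regex.Matches(input, @"(\w+)\t");
        string[] words = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
            words[i] = matches[i].Groups[1].Value;
        return words;
    }

    static void Main()
    {
        // The input is a regular string literal, so "\t" is a real tab character.
        foreach (string w in WordsBeforeTab("Name\tAddr\t"))
            Console.WriteLine(w);                                // Name, then Addr

        // \x20 and \u0020 both denote a space character.
        Console.WriteLine(Regex.IsMatch("a b", @"\w\x20\w"));    // True
        Console.WriteLine(Regex.IsMatch("a b", @"\w\u0020\w"));  // True

        // \* matches a literal asterisk instead of acting as a quantifier.
        Console.WriteLine(Regex.Match("3*9", @"\d\*\d").Value);  // 3*9
    }
}
```

Without the @ prefix, each backslash in the pattern would have to be doubled ("\\w\\x20\\w").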

Anchor in C#

An anchor causes a match to succeed or fail depending on the current position in the string. The anchors in C# are listed below:

Assertion | Description | Pattern | Matches
^ | The match must start at the beginning of the string or line. | ^\d{3} | "567" in "567-777-"
$ | The match must occur at the end of the string, or before \n at the end of the line or string. | -\d{4}$ | "-2012" in "8-12-2012"
\A | The match must occur at the start of the string. | \A\w{4} | "Code" in "Code-007-"
\Z | The match must occur at the end of the string, or before \n at the end of the string. | -\d{3}\Z | "-007" in "Bond-901-007"
\z | The match must occur at the end of the string. | -\d{3}\z | "-333" in "-901-333"
\G | The match must occur at the point where the previous match ended. | \G\(\d\) | "(1)", "(3)", "(5)" in "(1)(3)(5)[7](9)"
\b | The match must occur on a boundary between a \w (alphanumeric) and a \W (nonalphanumeric) character. | \b\w+\s\w+\b | "them theme", "them them" in "them theme them them"
\B | The match must not occur on a \b boundary. | \Bend\w*\b | "ends", "ender" in "end sends endure lender"
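The difference between some of these anchors is easiest to see side by side. The following sketch is only an illustration; the class name AnchorDemo, the helper IsThreeDigits, and the inputs are assumptions made for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchorDemo
{
    // True only if the whole input consists of exactly three digits.
    public static bool IsThreeDigits(string input)
    {
        // \A and \z pin the match to the absolute start and end of the string.
        return Regex.IsMatch(input, @"\A\d{3}\z");
    }

    static void Main()
    {
        Console.WriteLine(IsThreeDigits("567"));                  // True
        Console.WriteLine(IsThreeDigits("567-777"));              // False

        // $ also matches just before a final newline; \z does not.
        Console.WriteLine(Regex.IsMatch("567\n", @"^\d{3}$"));    // True
        Console.WriteLine(Regex.IsMatch("567\n", @"\A\d{3}\z"));  // False

        // \b keeps "end" from matching inside "lender".
        Console.WriteLine(Regex.Matches("end lender", @"\bend\b").Count); // 1
    }
}
```

For input validation, \A and \z are usually the safer pair because a trailing newline is not silently accepted.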

Grouping construct in C#

Grouping constructs delineate the subexpressions of a regular expression and capture substrings of an input string. The following table lists the grouping constructs in C#:

Grouping construct | Description | Pattern | Matches
(subexpression) | Captures the matched subexpression and assigns it an ordinal number, starting from 1 (group 0 is the entire match). | (\w)\1 | "ee" in "deep"
(?<name>subexpression) | Captures the matched subexpression into a named group. | (?<double>\w)\k<double> | "ee" in "deep"
(?<name1-name2>subexpression) | Defines a balancing group definition. | (((?'Open'\()[^\(\)]*)+((?'Close-Open'\))[^\(\)]*)+)*(?(Open)(?!))$ | "((1-3)*(3-1))" in "3+2^((1-3)*(3-1))"
(?:subexpression) | Defines a noncapturing group. | Write(?:Line)? | "WriteLine" in "Console.WriteLine()"
(?imnsx-imnsx:subexpression) | Applies or disables the specified options within subexpression. | A\d{2}(?i:\w+)\b | "A12xl", "A12XL" in "A12xl A12XL a12xl"
(?=subexpression) | Zero-width positive lookahead assertion. | \w+(?=\.) | "is", "ran", and "out" in "He is. The dog ran. The sun is out."
(?!subexpression) | Zero-width negative lookahead assertion. | \b(?!un)\w+\b | "sure", "used" in "unsure sure unity used"
(?<=subexpression) | Zero-width positive lookbehind assertion. | (?<=19)\d{2}\b | "99", "50", "05" in "1851 1999 1950 1905 2003"
(?<!subexpression) | Zero-width negative lookbehind assertion. | (?<!19)\d{2}\b | "51", "03" in "1851 1999 1950 1905 2003"
(?>subexpression) | Nonbacktracking (atomic) subexpression. | [13579](?>A+B+) | "1ABB", "3ABB", and "5AB" in "1ABB 3ABBC 5AB 5AC"
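A short sketch of named groups, noncapturing groups, and lookahead follows. The class GroupDemo, the helper YearOf, and the group names day, month, and year are hypothetical names chosen for this example.

```csharp
using System;
using System.Text.RegularExpressions;

static class GroupDemo
{
    // Extracts the year from a date such as "8-12-2012" by using a named group.
    public static string YearOf(string date)
    {
        Match m = Regex.Match(date, @"(?<day>\d{1,2})-(?<month>\d{1,2})-(?<year>\d{4})");
        return m.Success ? m.Groups["year"].Value : "";
    }

    static void Main()
    {
        Console.WriteLine(YearOf("8-12-2012"));   // 2012

        // (?:...) groups without capturing, so only group 0 (the whole match) exists.
        Match m = Regex.Match("Console.WriteLine()", @"Write(?:Line)?");
        Console.WriteLine(m.Value);               // WriteLine
        Console.WriteLine(m.Groups.Count);        // 1

        // Lookahead: words followed by a period; the period is not part of the match.
        foreach (Match w in Regex.Matches("He is. The dog ran.", @"\w+(?=\.)"))
            Console.WriteLine(w.Value);           // is, then ran
    }
}
```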

Character class in C#

A character class matches any one character from a set of characters. The character classes in C# are listed below:

Character class | Description | Pattern | Matches
[character_group] | Matches any single character in character_group. By default, the match is case-sensitive. | [mn] | "m" in "mat"; "m", "n" in "moon"
[^character_group] | Negation: matches any single character that is not in character_group. By default, characters in character_group are case-sensitive. | [^aei] | "v", "l" in "avail"
[first-last] | Character range: matches any single character in the range from first to last. | [b-d]irds | "birds", "cirds", "dirds" in "birds cirds dirds"
. | Wildcard: matches any single character except \n. | a.e | "ave" in "have"; "ate" in "mate"
\p{name} | Matches any single character in the Unicode general category or named block specified by name. | \p{Lu} | "C", "L" in "City Lights"
\P{name} | Matches any single character that is not in the Unicode general category or named block specified by name. | \P{Lu} | "i", "t", "y" in "City"
\w | Matches any word character. | \w | "R", "o", "o", "m", "1" in "Room#1"
\W | Matches any non-word character. | \W | "#" in "Room#1"
\s | Matches any white-space character. | \w\s | "D " in "ID A1.3"
\S | Matches any non-white-space character. | \s\S | " _" in "int __ctr"
\d | Matches any decimal digit. | \d | "4" in "4 = IV"
\D | Matches any character other than a decimal digit. | \D | " ", "=", " ", "I", "V" in "4 = IV"
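The sketch below exercises a few of these classes. The names CharClassDemo and CountDigits and the sample strings are assumptions for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

static class CharClassDemo
{
    // Counts the decimal digits in the input by using the \d class.
    public static int CountDigits(string input)
    {
        return Regex.Matches(input, @"\d").Count;
    }

    static void Main()
    {
        Console.WriteLine(CountDigits("Room#101"));                // 3

        // [^aei] matches every character except a, e, and i.
        Console.WriteLine(Regex.Replace("avail", "[^aei]", "_"));  // a_ai_

        // \p{Lu} matches uppercase letters (Unicode category Lu).
        foreach (Match m in Regex.Matches("City Lights", @"\p{Lu}"))
            Console.WriteLine(m.Value);                            // C, then L
    }
}
```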

Quantifier in C#

A quantifier specifies how many instances of the previous element (which can be a character, a group, or a character class) must be present in the input string for a match to occur.

Quantifier | Description | Pattern | Matches
* | Matches the previous element zero or more times. | \d*\.\d | ".0", "19.9", "219.9"
+ | Matches the previous element one or more times. | be+ | "bee" in "been", "be" in "bent"
? | Matches the previous element zero or one time. | rai?n | "ran", "rain"
{n} | Matches the previous element exactly n times. | ,\d{3} | ",043" in "1,043.6"; ",876", ",543", and ",210" in "9,876,543,210"
{n,} | Matches the previous element at least n times. | \d{2,} | "166", "29", "1930"
{n,m} | Matches the previous element at least n times, but no more than m times. | \d{3,5} | "166", "17668"; "19302" in "193024"
*? | Matches the previous element zero or more times, but as few times as possible. | \d*?\.\d | ".0", "19.9", "219.9"
+? | Matches the previous element one or more times, but as few times as possible. | be+? | "be" in "been", "be" in "bent"
?? | Matches the previous element zero or one time, but as few times as possible. | rai??n | "ran", "rain"
{n}? | Matches the previous element exactly n times. | ,\d{3}? | ",043" in "1,043.6"; ",876", ",543", and ",210" in "9,876,543,210"
{n,}? | Matches the previous element at least n times, but as few times as possible. | \d{2,}? | "16" in "166"; "29" in "29"; "19", "30" in "1930"
{n,m}? | Matches the previous element between n and m times, but as few times as possible. | \d{3,5}? | "166" in "166"; "176" in "17668"; "193", "024" in "193024"
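The practical difference between greedy and lazy quantifiers can be sketched as follows. QuantifierDemo and FirstMatch are hypothetical names, and the HTML-like input is invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class QuantifierDemo
{
    // Returns the text of the first match, or "" when nothing matches.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    static void Main()
    {
        string html = "<b>bold</b>";

        // Greedy: .+ takes as much as possible, up to the last '>'.
        Console.WriteLine(FirstMatch(html, "<.+>"));           // <b>bold</b>

        // Lazy: .+? stops at the first '>' that completes a match.
        Console.WriteLine(FirstMatch(html, "<.+?>"));          // <b>

        // {3,5} takes five digits when available; {3,5}? settles for three.
        Console.WriteLine(FirstMatch("193024", @"\d{3,5}"));   // 19302
        Console.WriteLine(FirstMatch("193024", @"\d{3,5}?"));  // 193
    }
}
```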

Backreference construct in C#

A backreference construct allows a previously matched subexpression to be referred to later in the same regular expression.

The backreference constructs in C# are listed below:

Backreference construct | Description | Pattern | Matches
\number | Numbered backreference. Matches the value of a numbered subexpression. | (\w)\1 | "ee" in "seek"
\k<name> | Named backreference. Matches the value of a named subexpression. | (?<char>\w)\k<char> | "ee" in "seek"
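A common use of backreferences is spotting a word typed twice in a row. The sketch below is one possible way to do that; BackreferenceDemo, FirstRepeatedWord, and the group name word are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class BackreferenceDemo
{
    // Returns the first word that appears twice in a row, or "" if there is none.
    public static string FirstRepeatedWord(string input)
    {
        // \k<word> must match exactly the same text that (?<word>...) captured.
        Match m = Regex.Match(input, @"\b(?<word>\w+)\s+\k<word>\b");
        return m.Success ? m.Groups["word"].Value : "";
    }

    static void Main()
    {
        Console.WriteLine(FirstRepeatedWord("this is is a test"));  // is

        // Numbered form: \1 refers to the first capturing group.
        Console.WriteLine(Regex.Match("seek", @"(\w)\1").Value);    // ee
    }
}
```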

Alternation construct in C#

Alternation constructs modify a regular expression to enable either/or matching. The following table lists the alternation constructs in C#:

Alternation construct | Description | Pattern | Matches
| (vertical bar) | Matches any one of the elements separated by the vertical bar (|) character. | th(e|is|at) | "the", "this" in "this is the day."
(?(expression)yes|no) | Matches yes if expression matches; otherwise, matches the optional no part. expression is interpreted as a zero-width assertion. | (?(A)A\d{2}\b|\b\d{3}\b) | "A10", "910" in "A10 C103 910"
(?(name)yes|no) | Matches yes if the named capture name has a match; otherwise, matches the optional no part. | (?<quoted>")?(?(quoted).+?"|\S+\s) | "Dogs.jpg " (with its trailing space) and "\"Yiska playing.jpg\"" in "Dogs.jpg \"Yiska playing.jpg\""
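The first two rows of the table can be tried out with a sketch like the one below. AlternationDemo and JoinMatches are hypothetical names introduced only to print the matches on one line.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class AlternationDemo
{
    // Collects all matches of the pattern and joins them with ", ".
    public static string JoinMatches(string input, string pattern)
    {
        List<string> values = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
            values.Add(m.Value);
        return string.Join(", ", values);
    }

    static void Main()
    {
        // Plain alternation: "th" followed by "e", "is", or "at".
        Console.WriteLine(JoinMatches("this is the day.", @"th(e|is|at)"));
        // this, the

        // Conditional: if the next character is "A", require A plus two digits;
        // otherwise require a standalone three-digit number.
        Console.WriteLine(JoinMatches("A10 C103 910", @"(?(A)A\d{2}\b|\b\d{3}\b)"));
        // A10, 910
    }
}
```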

Substitution in C#

Substitutions are used in replacement patterns. The table below lists the substitutions in C#:

Character | Description | Pattern | Replacement pattern | Input string | Result string
$number | Substitutes the substring matched by group number. | \b(\w+)(\s)(\w+)\b | $3$2$1 | "one two" | "two one"
${name} | Substitutes the substring matched by the named group name. | \b(?<word1>\w+)(\s)(?<word2>\w+)\b | ${word2} ${word1} | "one two" | "two one"
$$ | Substitutes a literal "$". | \b(\d+)\s?USD | $$$1 | "103 USD" | "$103"
$& | Substitutes a copy of the whole match. | \$?\d*\.?\d+ | **$&** | "$1.30" | "**$1.30**"
$` | Substitutes all the text of the input string before the match. | B+ | $` | "AABBCC" | "AAAACC"
$' | Substitutes all the text of the input string after the match. | B+ | $' | "AABBCC" | "AACCCC"
$+ | Substitutes the last group that was captured. | B+(C+) | $+ | "AABBCCDD" | "AACCDD"
$_ | Substitutes the entire input string. | B+ | $_ | "AABBCC" | "AAAABBCCCC"
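Three of these substitutions are shown in the sketch below, using Regex.Replace. The class SubstitutionDemo, the helper SwapWords, and the group names word1 and word2 are illustrative assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

static class SubstitutionDemo
{
    // Swaps two adjacent words by referring to the named groups in the replacement.
    public static string SwapWords(string input)
    {
        return Regex.Replace(input,
            @"\b(?<word1>\w+)(\s)(?<word2>\w+)\b",
            "${word2} ${word1}");
    }

    static void Main()
    {
        Console.WriteLine(SwapWords("one two"));                                // two one

        // $$ emits a literal dollar sign; $1 emits the first captured group.
        Console.WriteLine(Regex.Replace("103 USD", @"\b(\d+)\s?USD", "$$$1"));  // $103

        // $& stands for the entire matched text.
        Console.WriteLine(Regex.Replace("$1.30", @"\$?\d*\.?\d+", "**$&**"));   // **$1.30**
    }
}
```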

Miscellaneous constructs in C#

The table below lists the miscellaneous constructs in C#:

Construct | Definition | Example
(?imnsx-imnsx) | Sets or disables options such as case insensitivity in the middle of a pattern. | \bA(?i)b\w+\b matches "ABA", "Able" in "ABA Able Act"
(?#comment) | Inline comment. The comment ends at the first closing parenthesis. | \bA(?#Matches words starting with A)\w+\b
# [to end of line] | X-mode comment. The comment starts at an unescaped # and continues to the end of the line. | (?x)\bA\w+\b#Matches words starting with A
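The inline option and the x-mode comment can be tried with the following sketch. InlineOptionDemo and JoinMatches are hypothetical names, and the spacing inside the second pattern is added only to show that (?x) ignores unescaped white space.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class InlineOptionDemo
{
    // Collects all matches of the pattern and joins them with ", ".
    public static string JoinMatches(string input, string pattern)
    {
        List<string> values = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
            values.Add(m.Value);
        return string.Join(", ", values);
    }

    static void Main()
    {
        // (?i) switches on case-insensitive matching for the rest of the pattern,
        // so the "b" after it also matches "B".
        Console.WriteLine(JoinMatches("ABA Able Act", @"\bA(?i)b\w+\b"));
        // ABA, Able

        // (?x) ignores unescaped white space and treats # ... as a comment.
        string pattern = @"(?x) \b A \w+ \b   # words starting with A";
        Console.WriteLine(JoinMatches("ABA Able Act", pattern));
        // ABA, Able, Act
    }
}
```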

Regex class in C#

The Regex class in C# represents a regular expression. Its most commonly used methods are:

1. public bool IsMatch(string input)

Indicates whether the regular expression specified in the Regex constructor finds a match in the specified input string.

2. public bool IsMatch(string input, int startat)

Indicates whether the regular expression specified in the Regex constructor finds a match in the specified input string, beginning the search at position startat in the string.

3. public static bool IsMatch(string input, string pattern)

Indicates whether the specified regular expression finds a match in the specified input string.

4. public MatchCollection Matches(string input)

Searches the specified input string for all occurrences of the regular expression.

5. public string Replace(string input, string replacement)

In the specified input string, replaces all substrings that match the regular expression pattern with the specified replacement string.

6. public string[] Split(string input)

Splits the input string into an array of substrings at the positions defined by the regular expression pattern specified in the Regex constructor.

For a complete list of methods and properties, see the Microsoft documentation for the Regex class.
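The six methods listed above can be exercised together in a short sketch. The class RegexMethodsDemo, the helper SplitCsv, and all sample inputs are assumptions made for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexMethodsDemo
{
    // Splits a comma-separated line, tolerating white space around the commas.
    public static string[] SplitCsv(string line)
    {
        Regex separator = new Regex(@"\s*,\s*");
        return separator.Split(line);
    }

    static void Main()
    {
        Regex digits = new Regex(@"\d+");

        Console.WriteLine(digits.IsMatch("42 apples"));       // True
        // Searching from index 2 skips "42", so no digits remain.
        Console.WriteLine(digits.IsMatch("42 apples", 2));    // False
        // Static overload: no Regex instance is needed.
        Console.WriteLine(Regex.IsMatch("apples", @"\d+"));   // False

        Console.WriteLine(digits.Matches("10 20 30").Count);  // 3
        Console.WriteLine(digits.Replace("10 20 30", "#"));   // # # #

        Console.WriteLine(string.Join("|", SplitCsv("a , b,c"))); // a|b|c
    }
}
```

Creating the Regex instance once and reusing it, as with digits here, avoids passing the pattern string on every call.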

Example 1: Match words beginning with S

using System;
using System.Text.RegularExpressions;

namespace QTMCSharp
{
    class Program
    {
        // Prints the pattern, then each match found in the text.
        private static void showMatch(string text, string expr)
        {
            Console.WriteLine("Expression: " + expr);
            MatchCollection mc = Regex.Matches(text, expr);
            foreach (Match m in mc)
            {
                Console.WriteLine(m);
            }
        }

        static void Main(string[] args)
        {
            string str = "A Thousand Splendid Suns";
            Console.WriteLine("Matching words that start with 'S': ");
            // \b = word boundary, S = literal S, \S* = any run of non-white-space characters
            showMatch(str, @"\bS\S*");
            Console.ReadKey();
        }
    }
}

Compiling and running this C# program produces the following output:

Matching words that start with 'S': 
Expression: \bS\S*
Splendid
Suns

Example 2: Match words starting with c and ending with m

using System;
using System.Text.RegularExpressions;

namespace QTMCSharp
{
    class Program
    {
        // Prints the pattern, then each match found in the text.
        private static void showMatch(string text, string expr)
        {
            Console.WriteLine("Expression: " + expr);
            MatchCollection a = Regex.Matches(text, expr);
            foreach (Match b in a)
            {
                Console.WriteLine(b);
            }
        }

        static void Main(string[] args)
        {
            string str = "Keep calm and visit quantrimang dot com";
            Console.WriteLine("Matching words that start with 'c' and end with 'm':");
            // \bc = word starting with c, \S* = non-white-space characters, m\b = word ending with m
            showMatch(str, @"\bc\S*m\b");
            Console.ReadKey();
        }
    }
}

Compiling and running this C# program produces the following output:

Matching words that start with 'c' and end with 'm':
Expression: \bc\S*m\b
calm
com

Example 3: Replace extra white space

using System;
using System.Text.RegularExpressions;

namespace RegExApplication
{
    class Program
    {
        static void Main(string[] args)
        {
            string input = "QTM    greets     you!";
            // \s+ matches one or more consecutive white-space characters
            string pattern = "\\s+";
            string replacement = " ";
            Regex rgx = new Regex(pattern);
            string result = rgx.Replace(input, replacement);

            Console.WriteLine("Original string: {0}", input);
            Console.WriteLine("String with white space replaced: {0}", result);
            Console.ReadKey();
        }
    }
}

Compiling and running this C# program produces the following output:

Original string: QTM    greets     you!
String with white space replaced: QTM greets you!

According to Tutorialspoint
