REGEX - SPECIAL SEQUENCES


Special Sequences in Regex are shortcuts that make patterns easier to write.


They begin with a '\' followed by a letter, and the pair has a special meaning.


Let us see them below :


  1. \A

    The '\A' is used to check if a String starts with a certain character or characters.

    We will be taking the same example to explain '\A'.

    Let us say, we have the String,

    "Beautiful Nature"


    And let us check if the above String, starts with the letter 'B' or 'Bea'.

    Let us see in the below example,

    Example :



    import re

    str = "Beautiful Nature"

    # The raw string prefix 'r' keeps the backslash in '\A' as it is
    x = re.findall(r"\AB", str)

    if x:
        print("The String ::Beautiful Nature:: starts with B")
    else:
        print("The String ::Beautiful Nature:: does not start with B")

    y = re.findall(r"\ABea", str)

    if y:
        print("The String ::Beautiful Nature:: starts with Bea")
    else:
        print("The String ::Beautiful Nature:: does not start with Bea")
    


    Output :



      The String ::Beautiful Nature:: starts with B
      The String ::Beautiful Nature:: starts with Bea


    So, in the above example, we have taken the String 'Beautiful Nature' and assigned it to a variable 'str'.

    str = "Beautiful Nature"




    So, at first we have checked, if the above String starts with the letter 'B'.

    And this is where we have used the function 'findall( )'.

    x = re.findall(r"\AB", str)


    So, the 'findall( )' Function accepts two arguments, the pattern and the actual String.



    Now, let us come to the '\A' Symbol, which is used in the pattern,

    \AB


    That actually says, "Check if the String starts with the letter B".

    And in this case the String starts with 'B'. So, the control enters the 'if' block,

    if x:
        print("The String ::Beautiful Nature:: starts with B")
    


    And prints the value,

    The String ::Beautiful Nature:: starts with B


    Similarly, we have used the '\A' Symbol in the next line,

    y = re.findall(r"\ABea", str)


    To check, if the String begins with 'Bea'.

    \ABea


    And in this case, the String begins with 'Bea'. So, the control enters the 'if' block,

    if y:
        print("The String ::Beautiful Nature:: starts with Bea")
    


    And the below output is printed on the screen,

    The String ::Beautiful Nature:: starts with Bea


    A few more examples with the same String, str = "Beautiful Nature" :

    Example     : x = re.findall(r"\Ae", str)
    Description : There is no match, as the String doesn't begin with 'e'. It begins with 'B'.

    Example     : x = re.findall(r"\ABae", str)
    Description : There is no match, as the String doesn't begin with 'Bae'.
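
    As a side note on '\A', here is a minimal sketch, assuming Python's standard 're' module, of how '\A' differs from '^' when the String has more than one line. The two-line String and the variable names are made up for illustration.

```python
import re

# Hypothetical two-line String, used only for this illustration
text = "Beautiful Nature\nBeautiful Sky"

# '\A' anchors only at the very start of the whole String
start_only = re.findall(r"\ABeautiful", text)

# '^' with re.MULTILINE also anchors at the start of every line
each_line = re.findall(r"^Beautiful", text, flags=re.MULTILINE)

print(start_only)  # ['Beautiful']
print(each_line)   # ['Beautiful', 'Beautiful']
```

    So, '\A' is the safer choice when only the beginning of the entire String should count.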
  2. \b

    Well! In a normal Python string, '\b' stands for the Backspace character.

    But! Hold On!

    In a Regex pattern, '\b' is a Special Sequence that matches a word boundary, i.e. the position where a word begins or ends.

    We will be taking the same example to explain '\b'.

    Let us say, we have the String,

    "Beautiful Nature"


    And let us check if a word in the above String starts with 'Bea'.

    Let us see in the below example,

    Example :



    import re 
       
    str = "Beautiful Nature"
    x = re.findall("\bBea", str)
        
    if x:
        print("The String ::Beautiful Nature:: starts with Bea")
    else:
        print("The String ::Beautiful Nature:: does not start with Bea")
    


    Output :



      The String ::Beautiful Nature:: does not start with Bea


    The output looks a little weird, as the String 'Beautiful Nature' does start with 'Bea'.

    Still we got the output,

    The String ::Beautiful Nature:: does not start with Bea


    That is because the pattern in the below statement,

    x = re.findall("\bBea", str)


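    Is written as a normal Python string. So Python converts '\b' (i.e. in '\bBea') to the backspace character before the 're' module gets to see it.

    The pattern then looks for a backspace followed by 'Bea'. And the String has no backspace in it. So, we get the above output.

    Let's fix the above code with a raw string prefix (i.e. 'r'), which tells Python to leave the backslash as it is. Let us see in the next example.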

    Example :



    import re 
       
    str = "Beautiful Nature"
    x = re.findall(r"\bBea", str)
        
    if x:
        print("The String ::Beautiful Nature:: starts with Bea")
    else:
        print("The String ::Beautiful Nature:: does not start with Bea")
    


    Output :



      The String ::Beautiful Nature:: starts with Bea


    And all we have done is placed the raw string prefix (i.e. 'r') before '\b'.

    x = re.findall(r"\bBea", str)


    And this time '\b' is treated as a Special Sequence.

    So, it checks if a word in the String 'Beautiful Nature' begins with 'Bea'.

    And in this case the word 'Beautiful' begins with 'Bea'.

    A few more examples with the same String, str = "Beautiful Nature" :

    Example     : x = re.findall(r"\bure", str)
    Description : There is no match, as no word in the String begins with 'ure'. To check if a word ends with 'ure', place '\b' after it, i.e. r"ure\b", which finds a match in 'Nature'.

    Example     : x = re.findall("\be", str)
    Description : There is no match, as the raw string prefix is missing. So '\b' is a backspace, and the String has no backspace in it.

    Example     : x = re.findall(r"\be", str)
    Description : There is no match, as no word in the String begins with 'e'.

    Example     : x = re.findall(r"\bBae", str)
    Description : There is no match, as no word in the String begins with 'Bae'.
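
    The difference between the backspace and the word boundary can be checked directly. Below is a minimal sketch, assuming Python's standard 're' module. The variable names are made up for illustration, and '\w' (any letter, digit or underscore) is used only to pick one character next to each boundary.

```python
import re

plain = "\b"   # normal string: one character, the backspace (ASCII 8)
raw = r"\b"    # raw string: two characters, a backslash and the letter b

print(len(plain), len(raw))  # 1 2
print(plain == "\x08")       # True

text = "Beautiful Nature"

# '\b' before '\w' : the first character of every word
word_starts = re.findall(r"\b\w", text)
# '\b' after '\w' : the last character of every word
word_ends = re.findall(r"\w\b", text)

print(word_starts)  # ['B', 'N']
print(word_ends)    # ['l', 'e']
```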
  3. \B

    We have already seen that the lower case 'b' i.e. '\b', is used to check if a word begins or ends with a certain character or characters.

    Well! The upper case 'B' i.e. '\B' is just the opposite of it.

    i.e. '\B' matches only where there is no word boundary. Placed before the characters, it checks that they are not at the beginning of a word. Placed after the characters, it checks that they are not at the end of a word.

    Sounds complex?

    Let us clear with the below example.

    Let us say, we have the String,

    "Beautiful Nature"


    And let us check what '\B' does with 'Bea', which is at the beginning of the String.

    Let us see in the below example,

    Example :



    import re

    str = "Beautiful Nature"

    # '\B' before 'Bea' : 'Bea' must not be at the beginning of a word
    x = re.findall(r"\BBea", str)

    if x:
        print("The String ::Beautiful Nature:: starts with Bea")
    else:
        print("The String ::Beautiful Nature:: does not start with Bea")
    


    Output :



      The String ::Beautiful Nature:: does not start with Bea


    The output looks a little weird, as the String 'Beautiful Nature' does start with 'Bea'.

    Still we got the output,

    The String ::Beautiful Nature:: does not start with Bea


    That is because the pattern in the below statement,

    x = re.findall(r"\BBea", str)


    Is used to check if 'Bea' is found anywhere other than the beginning of a word.

    And in this case 'Bea' is at the beginning of the word 'Beautiful'. So, there is a mismatch and the 'else' block is executed.

    So, we get the above output.

    Let's fix the above code in the next example.

    Example :



    import re

    str = "Beautiful Nature"

    # '\B' before 'ful' : 'ful' must not be at the beginning of a word
    x = re.findall(r"\Bful", str)

    if x:
        print("The String ::Beautiful Nature:: does not start with ful")
    else:
        print("The String ::Beautiful Nature:: starts with ful")
    


    Output :



      The String ::Beautiful Nature:: does not start with ful


    And all we have done is, put some other set of characters i.e. 'ful', which is not at the beginning of any word in the String 'Beautiful Nature'.

    x = re.findall(r"\Bful", str)


    And this time we got the right output.

    The String ::Beautiful Nature:: does not start with ful


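    A few more examples with the same String, str = "Beautiful Nature" :

    Example     : x = re.findall(r"\Bure", str)
    Description : There is a match, as 'ure' is not at the beginning of a word. It is found inside the word 'Nature'.

    Example     : x = re.findall(r"ure\B", str)
    Description : There is a mismatch, as the word 'Nature' ends with 'ure'. When '\B' is placed after the characters, it checks that a word doesn't end with them.

    Example     : x = re.findall(r"\Be", str)
    Description : There is a match, as the letter 'e' is found inside the words 'Beautiful' and 'Nature', and not at the beginning of either.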
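
    To see '\b' and '\B' side by side, here is a minimal sketch, assuming Python's standard 're' module. It tries all four placements around 'ful'. The variable names are made up for illustration.

```python
import re

text = "Beautiful Nature"

# 'ful' begins inside the word 'Beautiful', so '\B' before it matches
inside = re.findall(r"\Bful", text)
# No word begins with 'ful', so '\b' before it does not match
at_word_start = re.findall(r"\bful", text)
# The word 'Beautiful' ends with 'ful', so '\b' after it matches
at_word_end = re.findall(r"ful\b", text)
# For the same reason, '\B' after it does not match
not_at_word_end = re.findall(r"ful\B", text)

print(inside)           # ['ful']
print(at_word_start)    # []
print(at_word_end)      # ['ful']
print(not_at_word_end)  # []
```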
  4. \d

    '\d' matches any digit from 0 to 9. So it is used to check if a String contains at least one digit.

    Let us make it clear with the below example.

    Let us say, we have the String,

    "Beautiful Nature"


    And let us check if the above String has any number or not? Well! We know it doesn't.

    Let us see that in the below example,

    Example :



    import re

    str = "Beautiful Nature"

    # '\d' matches a single digit
    x = re.findall(r"\d", str)

    if x:
        print("The String ::Beautiful Nature:: has a number in it")
    else:
        print("The String ::Beautiful Nature:: does not have a number in it")
    


    Output :



      The String ::Beautiful Nature:: does not have a number in it


    And the output is obvious.

    Since the above String 'Beautiful Nature' has no digits in it, the expression,

    x = re.findall(r"\d", str)


    Didn't find a match.

    Let's add a number to the String in the next example.

    Example :



    import re

    str = "Beautiful Nature 92"

    # '\d' matches a single digit, so '9' and '2' are both found
    x = re.findall(r"\d", str)

    if x:
        print("The String ::Beautiful Nature 92:: has a number in it")
    else:
        print("The String ::Beautiful Nature 92:: does not have a number in it")
    


    Output :



      The String ::Beautiful Nature 92:: has a number in it


    And all we have done is, modified the String to 'Beautiful Nature 92'.

    str = "Beautiful Nature 92"


    And this time, the String has a number in it. So the below expression,

    x = re.findall(r"\d", str)


    Finds a match for each digit, i.e. '9' and '2'.

    And we get the below output,

    The String ::Beautiful Nature 92:: has a number in it
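
    It helps to look at what '\d' actually returns. Below is a minimal sketch, assuming Python's standard 're' module. The '+' quantifier (one or more) and the function 're.search( )' are not covered above, and the variable names are made up for illustration.

```python
import re

text = "Beautiful Nature 92"

# '\d' matches one digit at a time
digits = re.findall(r"\d", text)
# '\d+' matches a run of one or more digits
numbers = re.findall(r"\d+", text)
# re.search() stops at the first match, which is enough for a yes/no check
has_digit = re.search(r"\d", text) is not None

print(digits)     # ['9', '2']
print(numbers)    # ['92']
print(has_digit)  # True
```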
  5. \D

    '\D' is the opposite of '\d'. It matches any character that is not a digit. So it is used to check that a String does not contain any number.

    Let us make it clear with the below example.

    Let us say, we have the String,

    "Beautiful Nature"


    Which doesn't have any numbers in it.

    Let us see that in the below example,

    Example :



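    import re

    str = "Beautiful Nature"

    # '\D' matches one non-digit character. '\D+' with fullmatch( )
    # needs every character of the String to be a non-digit
    x = re.fullmatch(r"\D+", str)

    if x:
        print("The String ::Beautiful Nature:: does not have any number in it")
    else:
        print("The String ::Beautiful Nature:: has a number in it")



    Output :



      The String ::Beautiful Nature:: does not have any number in it


    And the output is obvious.

    The function 'fullmatch( )' with the pattern '\D+' finds a match only if every character of the String is a non-digit. Since the above String 'Beautiful Nature' has no numbers in it, the expression,

    x = re.fullmatch(r"\D+", str)


    Found a match.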

    Let's see the next example, where there is a number in the String.

    Example :



    import re

    str = "Beautiful Nature 92"

    # '9' and '2' are digits, so '\D+' can't cover the whole String
    x = re.fullmatch(r"\D+", str)

    if x:
        print("The String ::Beautiful Nature 92:: does not have any number in it")
    else:
        print("The String ::Beautiful Nature 92:: has a number in it")



    Output :



      The String ::Beautiful Nature 92:: has a number in it


    And all we have done is, modified the String to 'Beautiful Nature 92'.

    str = "Beautiful Nature 92"


    And this time, the String has a number in it. The function 'findall( )' with '\D' would still find the letters and the white space of this String. So 'fullmatch( )' is used, and the below expression,

    x = re.fullmatch(r"\D+", str)


    Doesn't find a match, because '\D' is the opposite of '\d' and doesn't accept the digits '9' and '2'.

    And we get the below output,

    The String ::Beautiful Nature 92:: has a number in it
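
    To see which characters '\D' picks on its own, here is a minimal sketch, assuming Python's standard 're' module. The short String 'Nature 92' and the variable names are made up for illustration, and 're.sub( )' (replace every match) is not covered above.

```python
import re

text = "Nature 92"

# '\D' matches every character that is not a digit, one at a time
non_digits = re.findall(r"\D", text)
# Replacing every non-digit with nothing leaves only the digits
only_digits = re.sub(r"\D", "", text)

print(non_digits)   # ['N', 'a', 't', 'u', 'r', 'e', ' ']
print(only_digits)  # 92
```

    Note that the white space is also a non-digit, so it shows up in the list.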
  6. \s

    '\s' matches a white space character, such as a space, a tab or a new line. So it is used to check if the String has a white space in it.

    Let us make it clear with the below example.

    Let us say, we have the String,

    "Beautiful Nature"


    And there is a white space between 'Beautiful' and 'Nature'. So '\s' would find a match.

    Let us see that in the below example,

    Example :



    import re

    str = "Beautiful Nature"

    # '\s' matches a single white space character
    x = re.findall(r"\s", str)

    if x:
        print("The String ::Beautiful Nature:: has a white space in it")
    else:
        print("The String ::Beautiful Nature:: does not have a white space in it")
    


    Output :



      The String ::Beautiful Nature:: has a white space in it


    And the output is obvious.

    Since, the above String 'Beautiful Nature' has a white space between 'Beautiful' and 'Nature'. The expression,

    x = re.findall(r"\s", str)


    Found a match.
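
    White space is more than the space character. Below is a minimal sketch, assuming Python's standard 're' module, where '\s' also picks up a tab and a new line. The String and the variable names are made up for illustration.

```python
import re

# Hypothetical String with a tab, a new line and a space in it
text = "Beautiful\tNature\nis calm"

spaces = re.findall(r"\s", text)

print(spaces)  # ['\t', '\n', ' ']
```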
  7. \S

    '\S' is the opposite of '\s'. It matches any character that is not a white space. So it is used to check if there is no white space in the String.

    Let us make it clear with the below example.

    Let us say, we have the String,

    "Beautiful Nature"


    And there is a white space between 'Beautiful' and 'Nature'. So a check for no white space would fail for it. Let us use the String 'Beautiful' instead.

    Let us see that in the below example,

    Example :



    import re

    str = "Beautiful"

    # '\S' matches one non white space character. '\S+' with fullmatch( )
    # needs every character of the String to be a non white space
    x = re.fullmatch(r"\S+", str)

    if x:
        print("The String ::Beautiful:: does not have a white space in it")
    else:
        print("The String ::Beautiful:: has a white space in it")



    Output :



      The String ::Beautiful:: does not have a white space in it


    Even this time the output is obvious.

    We have used the String 'Beautiful' this time. And there is no white space in it.

    str = "Beautiful"


    The function 'fullmatch( )' with the pattern '\S+' finds a match only if every character of the String is a non white space. So, the expression,

    x = re.fullmatch(r"\S+", str)


    Finds a match because there is no white space in the String.

    And we get the below output,

    The String ::Beautiful:: does not have a white space in it
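
    A common use of '\S' is to pull the words out of a String. Below is a minimal sketch, assuming Python's standard 're' module. The '+' quantifier (one or more) is not covered above, and the String and variable names are made up for illustration.

```python
import re

# Hypothetical String with extra white space around the words
text = "  Beautiful   Nature \n"

# '\S' matches one non white space character at a time
chars = re.findall(r"\S", "a b")
# '\S+' matches a whole run of non white space characters
words = re.findall(r"\S+", text)

print(chars)  # ['a', 'b']
print(words)  # ['Beautiful', 'Nature']
```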
  8. \w

    '\w' matches any one of the characters from a to z or A to Z or 0 to 9 or an underscore '_'. So it is used to check if a String contains at least one such character. Any other character in the String is simply not matched by '\w'.

    Let us make it clear with the below example.

    Let us say, we have the String,

    "Beautiful Nature!!"


    And there are two '!' characters, which '\w' doesn't accept. So, does it find a match or not?

    Let us see that in the below example,

    Example :



    import re

    str = "Beautiful Nature!!"

    # '\w' matches a single letter, digit or underscore
    x = re.findall(r"\w", str)

    print(x)

    if x:
        print("The String ::Beautiful Nature!!:: match found")
    else:
        print("The String ::Beautiful Nature!!:: match is not found")
    


    Output :



      ['B', 'e', 'a', 'u', 't', 'i', 'f', 'u', 'l', 'N', 'a', 't', 'u', 'r', 'e']
      The String ::Beautiful Nature!!:: match found


    So, in the above example, we have used the String 'Beautiful Nature!!'

    str = "Beautiful Nature!!"


    So, the expression,

    x = re.findall(r"\w", str)


    Finds a match because at least one character meets the criteria of '\w'. i.e. If you see the print statement,

    print(x)


    The output shows a list,

    ['B', 'e', 'a', 'u', 't', 'i', 'f', 'u', 'l', 'N', 'a', 't', 'u', 'r', 'e']


    That has all possible matches. But the above list doesn't have the white space ' ' or '!' in it. That is because they don't match the criteria of '\w'.

    And we get the below output,

    The String ::Beautiful Nature!!:: match found


    Similarly, let us look at the next example, where we have the below String,

    "$@ &!"


    So, let us see the below example with the String '$@ &!'.

    Example :



    import re

    str = "$@ &!"

    # None of these characters is a letter, digit or underscore
    x = re.findall(r"\w", str)

    print(x)

    if x:
        print("The String ::$@ &!:: match found")
    else:
        print("The String ::$@ &!:: match is not found")
    


    Output :



      []
      The String ::$@ &!:: match is not found


    So, in the above output, we can see the match is not found for the String '$@ &!'.

    That is because the String '$@ &!' doesn't have a single character that matches the condition of '\w'.
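
    Below is a minimal sketch, assuming Python's standard 're' module, that shows the underscore and the digits being accepted by '\w'. It also shows that, for normal Python 3 strings, '\w' accepts letters beyond a to z unless the flag 're.ASCII' is passed. The Strings and variable names are made up for illustration, and the '+' quantifier (one or more) is not covered above.

```python
import re

# Letters, the underscore and digits all count for '\w'
chars = re.findall(r"\w", "a_1!")
# '\w+' matches a whole run of such characters
words = re.findall(r"\w+", "snake_case, Nature92!!")

# "caf\u00e9" is the word 'cafe' with an accented last letter
unicode_chars = re.findall(r"\w", "caf\u00e9")
ascii_chars = re.findall(r"\w", "caf\u00e9", flags=re.ASCII)

print(chars)               # ['a', '_', '1']
print(words)               # ['snake_case', 'Nature92']
print(len(unicode_chars))  # 4
print(ascii_chars)         # ['c', 'a', 'f']
```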
  9. \W

    '\W' is just the opposite of '\w'. It is used to check if a String contains at least one character that is not from a to z or A to Z or 0 to 9 or an underscore '_'.

    Let us say, we have the String,

    "Beautiful Nature!!"


    And there are two '!' characters, which '\W' should accept. So, does it find a match or not?

    Let us see that in the below example,

    Example :



    import re

    str = "Beautiful Nature!!"

    # '\W' matches a single character that is not a letter, digit or underscore
    x = re.findall(r"\W", str)

    print(x)

    if x:
        print("The String ::Beautiful Nature!!:: match found")
    else:
        print("The String ::Beautiful Nature!!:: match is not found")
    


    Output :



      [' ', '!', '!']
      The String ::Beautiful Nature!!:: match found


    So, in the above example, we have used the String 'Beautiful Nature!!'

    str = "Beautiful Nature!!"


    So, the expression,

    x = re.findall(r"\W", str)


    Finds a match because there is at least one match. i.e. If you see the print statement,

    print(x)


    The output shows a list,

    [' ', '!', '!']


    That has the match for a white space ' ' and '!'.

    Since, it is the inverse of '\w'. It finds a match for the above.

    And we get the below output,

    The String ::Beautiful Nature!!:: match found
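
    '\W' is handy for cleaning up a String. Below is a minimal sketch, assuming Python's standard 're' module. The functions 're.sub( )' (replace every match) and 're.split( )' (cut the String at every match) are not covered above, and the variable names are made up for illustration.

```python
import re

text = "Beautiful Nature!!"

# Remove every character that '\W' matches
cleaned = re.sub(r"\W", "", text)
# Cut the String wherever one or more such characters occur
parts = re.split(r"\W+", text)

print(cleaned)  # BeautifulNature
print(parts)    # ['Beautiful', 'Nature', '']
```

    The empty String at the end of the list is there because the String ends with '!!', so nothing is left after the last cut.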
  10. \Z

    '\Z' is used to check if a String ends with a certain character or characters.

    Say for example, let us take the above String,

    "Beautiful Nature"


    And let us check if the above String, ends with the letter 'e' or 'ture'.

    Let us see in the below example,

    Example :



    import re

    str = "Beautiful Nature"

    # '\Z' matches only at the very end of the String
    x = re.findall(r"e\Z", str)

    if x:
        print("The String ::Beautiful Nature:: ends with e")
    else:
        print("The String ::Beautiful Nature:: does not end with e")

    y = re.findall(r"ture\Z", str)

    if y:
        print("The String ::Beautiful Nature:: ends with ture")
    else:
        print("The String ::Beautiful Nature:: does not end with ture")
    


    Output :



      The String ::Beautiful Nature:: ends with e
      The String ::Beautiful Nature:: ends with ture


    So, in the above example, we have checked, if the above String ends with the letter 'e'.

    x = re.findall(r"e\Z", str)


    And since, the String 'Beautiful Nature' ends with 'e'. We got the below output.

    The String ::Beautiful Nature:: ends with e


    Similarly, we have used the '\Z' Symbol in the next line,

    y = re.findall(r"ture\Z", str)


    To check, if the String ends with 'ture'.

    ture\Z


    And in this case, the String ends with 'ture'. So, we got the below output.

    The String ::Beautiful Nature:: ends with ture
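
    As a side note on '\Z', here is a minimal sketch, assuming Python's standard 're' module, of how '\Z' differs from '$' when the String ends with a new line. The String and the variable names are made up for illustration.

```python
import re

# Hypothetical String that ends with a new line
text = "Beautiful Nature\n"

# '\Z' matches only at the very end, i.e. after the new line
strict_end = re.findall(r"e\Z", text)
# '$' also matches just before a new line at the end of the String
dollar_end = re.findall(r"e$", text)

print(strict_end)  # []
print(dollar_end)  # ['e']
```

    So, '\Z' is the stricter of the two when the String must end exactly with the given characters.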