



REGEX - META-CHARACTERS


Meta-Characters are the characters that Regex treats in a special way, instead of matching them literally.


So, we will be taking the String,

"Beautiful Nature"

And use the Meta-Characters, one by one, to search for different patterns in the String.


Let us look at the different Meta-Characters we can use :

  1. ^ Symbol



    The ^ symbol is used to check if a String starts with a certain character or characters.

    Say for example, let us take the above String,

    "Beautiful Nature"


    And let us check if the above String, starts with the letter B or Bea.

    Let us see in the below example,

    Example :



    fun main(args: Array<String>) {
    
    	var str = "Beautiful Nature"
    	var x = Regex("^B")
    	var result = x.containsMatchIn(str)
    
    	if (result) {
    		println("The String ::Beautiful Nature:: starts with B")
    	}
    	else {
    		println("The String ::Beautiful Nature:: does not start with B")
    	}
    
    	var y = Regex("^Bea")
    	var result2 = y.containsMatchIn(str)
    
    	if (result2) {
    		println("The String ::Beautiful Nature:: starts with Bea")
    	}
    	else {
    		println("The String ::Beautiful Nature:: does not start with Bea")
    	}
    }
    


    Output :



      The String ::Beautiful Nature:: starts with B
      The String ::Beautiful Nature:: starts with Bea


    So, in the above example, we have taken the String Beautiful Nature and initialised it to a variable str.

    var str = "Beautiful Nature"


    So, at first we have checked, if the above String starts with the letter B.

    And this is where we have used the Regex("^B").

    var x = Regex("^B")


    Then comes the containsMatchIn() Function that accepts the actual String (i.e. Beautiful Nature) as an argument and returns true if the pattern is found in it,

    var result = x.containsMatchIn(str)


    Now, let us come to the ^ Symbol, which is used in the pattern,

    ^B


    That actually says, "Check if the String starts with the letter B".

    And in this case the String starts with B. So, the control enters the if block,

    if (result) {
    	println("The String ::Beautiful Nature:: starts with B")
    }


    And prints the value,

    The String ::Beautiful Nature:: starts with B


    Similarly, we have used the ^ Symbol in the next line,

    var y = Regex("^Bea")


    To check, if the String begins with Bea.

    ^Bea


    And in this case, the String begins with Bea. So, the control enters the if block,

    if (result2) {
    	println("The String ::Beautiful Nature:: starts with Bea")
    }


    And the below output is printed on the screen,

    The String ::Beautiful Nature:: starts with Bea
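
    To see what the ^ symbol really adds, below is a small sketch that runs the same search with and without it. The helper name startsWithPattern is only an assumption made for this illustration and is not part of the example above.

```kotlin
// Hypothetical helper (not from the tutorial): true when `text` begins with `prefixPattern`.
fun startsWithPattern(text: String, prefixPattern: String): Boolean =
    Regex("^" + prefixPattern).containsMatchIn(text)

fun main() {
    val str = "Beautiful Nature"

    // Without ^ the pattern may be found anywhere in the String.
    println(Regex("Nature").containsMatchIn(str))   // true

    // With ^ the match has to begin at the first character.
    println(startsWithPattern(str, "Nature"))       // false
    println(startsWithPattern(str, "Bea"))          // true
}
```

    So, Nature is present in the String, but the String does not start with it.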

  2. $ Symbol



    The $ symbol is used to check if a String ends with a certain character or characters.

    Say for example, let us take the above String,

    "Beautiful Nature"


    And let us check if the above String, ends with the letter e or ture.

    Let us see in the below example,

    Example :



    fun main(args: Array<String>) {
    
    	var str = "Beautiful Nature"
    	var x = Regex("e$")
    	var result = x.containsMatchIn(str)
    
    	if (result) {
    		println("The String ::Beautiful Nature:: ends with e")
    	}
    	else {
    		println("The String ::Beautiful Nature:: does not end with e")
    	}
    
    	var y = Regex("ture$")
    	var result2 = y.containsMatchIn(str)
    
    	if (result2) {
    		println("The String ::Beautiful Nature:: ends with ture")
    	}
    	else {
    		println("The String ::Beautiful Nature:: does not end with ture")
    	}
    }
    


    Output :



      The String ::Beautiful Nature:: ends with e
      The String ::Beautiful Nature:: ends with ture


    So, in the above example, we have taken the String Beautiful Nature and initialised it to a variable str.

    var str = "Beautiful Nature"


    So, at first we have checked, if the above String ends with the letter e.

    And this is where we have used the Regex("e$").

    var x = Regex("e$")


    Then comes the containsMatchIn() Function that accepts the actual String (i.e. Beautiful Nature) as an argument and returns true if the pattern is found in it,

    var result = x.containsMatchIn(str)

    Now, let us come to the $ Symbol, which is used in the pattern,

    e$


    That actually says, "Check if the String ends with the letter e".

    And in this case the String ends with e. So, the control enters the if block,

    if (result) {
    	println("The String ::Beautiful Nature:: ends with e")
    }


    And prints the value,

    The String ::Beautiful Nature:: ends with e


    Similarly, we have used the $ Symbol in the next line,

    var y = Regex("ture$")


    To check, if the String ends with ture.

    ture$


    And in this case, the String ends with ture. So, the control enters the if block,

    if (result2) {
    	println("The String ::Beautiful Nature:: ends with ture")
    }


    And the below output is printed on the screen,

    The String ::Beautiful Nature:: ends with ture
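
    The same comparison can be done for the $ symbol. Below is a small sketch, where the helper name endsWithPattern is only an assumption made for this illustration. The $ is written as \$ inside the helper so that Kotlin does not read it as the start of a String template.

```kotlin
// Hypothetical helper (not from the tutorial): true when `text` ends with `suffixPattern`.
fun endsWithPattern(text: String, suffixPattern: String): Boolean =
    Regex(suffixPattern + "\$").containsMatchIn(text)

fun main() {
    val str = "Beautiful Nature"

    // Without $ the pattern may be found anywhere in the String.
    println(Regex("Beautiful").containsMatchIn(str))  // true

    // With $ the match has to finish at the last character.
    println(endsWithPattern(str, "Beautiful"))        // false
    println(endsWithPattern(str, "ture"))             // true
}
```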

  3. . Symbol



    The . symbol is used to match any single character except a new line.

    Say for example, let us take the above String,

    "Beautiful Nature"


    And let us say, you know there is a String but you forgot the spelling of Beautiful. You only remember that it starts with Beaut and ends with an l.

    In such case the . symbol comes to the rescue. Just substitute each unknown character with a . and you can search for the actual String.

    'Beaut...l'


    Each . above will match exactly one character of the actual String.

    Let us see in the below example,

    Example :



    fun main(args: Array<String>) {
    
    	var str = "Beautiful Nature"
    	var x = Regex("Beaut...l")
    
    	val match = x.find(str)
    
    	var data = match?.value
    
    	println("The matched string is :: "+data)
    }
    


    Output :



      The matched string is :: Beautiful


    So, in the above example, we have taken the String Beautiful Nature and initialised it to a variable str.

    str = "Beautiful Nature"


    And checked, if at all there is a word, that starts with Beaut and ends with an l.

    var x = Regex("Beaut...l")


    And

    val match = x.find(str)


    And Kotlin checks the actual String,

    And finds that there is a word, Beautiful, that starts with Beaut and ends with an l. And the three dots . are matched by i, f and u.


    So, the word Beautiful is fetched and put to the variable data.

    val match = x.find(str)
    var data = match?.value

    Note : The number of dots . has to be equal to the number of unknown characters, else the pattern won't match the String.


    And the print statement,

    println("The matched string is :: "+data)


    Prints the matched value,

    The matched string is :: Beautiful
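
    The note about the number of dots can be tried out with the sketch below. The helper name firstMatch is only an assumption made for this illustration. It returns the first match, or null if there is none.

```kotlin
// Hypothetical helper (not from the tutorial): first match of `pattern` in `text`, or null.
fun firstMatch(pattern: String, text: String): String? =
    Regex(pattern).find(text)?.value

fun main() {
    val str = "Beautiful Nature"

    // Three dots stand for the three letters i, f and u.
    println(firstMatch("Beaut...l", str))  // Beautiful

    // Two dots are one too few, so nothing is found.
    println(firstMatch("Beaut..l", str))   // null

    // Four dots stand for a, t, u and r.
    println(firstMatch("N....e", str))     // Nature
}
```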

  4. + Symbol



    The + symbol is used to match one or more occurrences of the character (or group) placed just before it.

    Let us take a different String to understand the + Symbol,

    "Cool Guy"


    And let us say, you know there is a String but you forgot the spelling of Cool. And you are confused, if there is one o or two os in the string Cool.

    In such case + Symbol comes to rescue. Just place the + symbol after o.

    'Co+l'


    And no matter how many os are there (as long as there is at least one), the exact String would be returned.

    Let us see in the below example,

    Example :



    fun main() {
    
    	var str = "Cool Guy"
    	var x = Regex("Co+l")
    
    	val match = x.find(str)
    
    	var data = match?.value
    
    	println("The matched string is :: "+data)
    }
    


    Output :



      The matched string is :: Cool


    So, in the above example, we have checked, if at all there is a word, that starts with Co and ends with an l. No matter how many os are there.

    var x = Regex("Co+l")
    val match = x.find(str)
    var data = match?.value


    And Kotlin checks the actual String,

    And finds that there is a word, Cool, that starts with Co and ends with an l. And the two os in it are matched by o+.

    So, the word Cool is fetched and put to the variable data.

    var data = match?.value



    And the print statement,

    println("The matched string is :: "+data)


    Prints the value of data,

    The matched string is :: Cool


    Even if there were many more os in the String, the match would be found.

    Example :



    fun main(args: Array<String>) {
    
    	var str = "Cooooool Guy"
    	var x = Regex("Co+l")
    
    	val match = x.find(str)
    
    	var data = match?.value
    
    	println("The matched string is :: "+data)
    }
    


    Output :



      The matched string is :: Cooooool
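
    One thing the examples above do not show is the lower limit of the + symbol. There has to be at least one o. Below is a small sketch with three Strings. The helper name firstMatch is only an assumption made for this illustration.

```kotlin
// Hypothetical helper (not from the tutorial): first match of `pattern` in `text`, or null.
fun firstMatch(pattern: String, text: String): String? =
    Regex(pattern).find(text)?.value

fun main() {
    // No o, one o and many os.
    for (text in listOf("Cl Guy", "Col Guy", "Cooooool Guy")) {
        println(text + " -> " + firstMatch("Co+l", text))
    }
    // Cl Guy -> null
    // Col Guy -> Col
    // Cooooool Guy -> Cooooool
}
```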

  5. * Symbol



    The * symbol is the same as the + symbol with a little difference. It is used to match zero or more occurrences of the character (or group) placed just before it.

    Let us take the below String to understand the * Symbol,

    "Cool Guy"


    And let us say, you know there is a String but you forgot the spelling of Cool. And you are confused, if there is one o or two os in the string Cool.

    In such case * Symbol comes to rescue. Just place the * symbol after o.

    'Co*l'


    And no matter how many os are present (even if there are none), the exact String would be returned.

    Let us see in the below example,

    Example :



    fun main(args: Array<String>) {
    
    	var str = "Cool Guy"
    	var x = Regex("Co*l")
    
    	val match = x.find(str)
    
    	var data = match?.value
    
    	println("The matched string is :: "+data)
    }
    


    Output :



      The matched string is :: Cool


    So, in the above code, we have checked, if at all there is a word, that starts with Co and ends with an l. No matter how many os are there.

    var x = Regex("Co*l")
    val match = x.find(str)
    var data = match?.value


    And Kotlin checks the actual String,

    And finds that there is a word, Cool, that starts with Co and ends with an l. And the two os in it are matched by o*.

    So, the word Cool is fetched and put to the variable data.

    var data = match?.value



    And the print statement,

    println("The matched string is :: "+data)


    Prints the value of data,

    The matched string is :: Cool


    So, how is * Symbol different from + Symbol?

    Let us consider a String where there are no os at all.

    "Cl Guy"


    Let us see in the below example.

    Example :



    fun main(args: Array<String>) {
    
    	var str = "Cl Guy"
    	var x = Regex("Co*l")
    
    	val match = x.find(str)
    
    	var data = match?.value
    
    	println("The matched string is :: "+data)
    }
    


    Output :



      The matched string is :: Cl


    That is because the * Symbol searches for zero or more occurrences.

    So, the String,

    str = "Cl Guy"


    Doesn't have any os in it. Still the pattern,

    var x = Regex("Co*l")


    Is able to match the String Cl.
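
    Below is a small sketch that puts + and * side by side on the String without any o. It also shows a side effect of "zero or more" : a pattern made only of o* can match an empty String. The helper name firstMatch is only an assumption made for this illustration.

```kotlin
// Hypothetical helper (not from the tutorial): first match of `pattern` in `text`, or null.
fun firstMatch(pattern: String, text: String): String? =
    Regex(pattern).find(text)?.value

fun main() {
    val str = "Cl Guy"

    // + needs at least one o, so there is no match.
    println(firstMatch("Co+l", str))  // null

    // * is happy with zero os.
    println(firstMatch("Co*l", str))  // Cl

    // o* alone matches "zero os" right at the start, i.e. an empty String.
    println("[" + firstMatch("o*", str) + "]")  // []
}
```

    So, when a * pattern seems to match "nothing", it is worth checking if an empty match was returned.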

  6. ? Symbol



    The ? symbol is the same as the + and * symbols with a mild difference. It is used to match zero or exactly one occurrence of the character (or group) placed just before it.

    Let us take the below String to understand the ? Symbol,

    "Cool Guy"


    So, at first, let us understand, how is ? Symbol different from + and * Symbol.

    And as we have done for the * Symbol, you know there is a String but you forgot the spelling of Cool.

    And you are confused, if there is one o or two os in the string Cool.

    Let's see, how ? symbol responds in such case.

    'Co?l'


    Let us see in the below example,

    Example :



    fun main(args: Array<String>) {
    
    	var str = "Cool Guy"
    	var x = Regex("Co?l")
    
    	val match = x.find(str)
    
    	var data = match?.value
    
    	println("The matched string is :: "+data)
    }
    


    Output :



      The matched string is :: null


    And we got null (i.e. no match was found).

    This is because the below line,

    var x = Regex("Co?l")


    Searches for the pattern Co?l.

    And the ? Symbol searches for zero or one occurrence of o (i.e. Co?l). But in the String Cool Guy, the substring Cool has two os. So no match is found.

    So, when does the ? Symbol find a match?

    Let us consider a String where there is no o or just one o.

    "Cl Guy"


    Or

    "Col Guy"


    Let us see in the next example.

    Example :



    fun main(args: Array<String>) {
    
    	var str = "Cl Guy"
    	var x = Regex("Co?l")
    
    	val match = x.find(str)
    
    	var data = match?.value
    
    	println("The matched string is :: "+data)
    }
    


    Output :



      The matched string is :: Cl
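
    The String Col Guy was mentioned above but not run. Below is a small sketch that tries the pattern Co?l on all the three Strings. The helper name firstMatch is only an assumption made for this illustration.

```kotlin
// Hypothetical helper (not from the tutorial): first match of `pattern` in `text`, or null.
fun firstMatch(pattern: String, text: String): String? =
    Regex(pattern).find(text)?.value

fun main() {
    // Zero os, one o and two os.
    for (text in listOf("Cl Guy", "Col Guy", "Cool Guy")) {
        println(text + " -> " + firstMatch("Co?l", text))
    }
    // Cl Guy -> Cl
    // Col Guy -> Col
    // Cool Guy -> null
}
```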

  7. | Symbol



    The | symbol is like OR. It matches either the pattern on its left or the pattern on its right.

    Say for example, let us take the String,

    "Cool Guy"


    And let us say, you want to check, how many times the letters o and u occur in the String.

    In such case | Symbol is the option. Just give the option of u|o(i.e. u or o).

    'u|o'


    And no matter where u or o is, Kotlin finds them.

    Let us see in the below example,

    Example :



    fun main(args: Array<String>) {
    
    	var str = "Cool Guy"
    	var x = Regex("u|o")
    
    	val match = x.findAll(str)
    
    	match.forEach { word -> println("The matched letter is :: "+word.value) }
    }
    


    Output :



      The matched letter is :: o
      The matched letter is :: o
      The matched letter is :: u


    So, in the above example, we have checked, how many times the letters o and u occur in the String.

    var x = Regex("u|o")


    And Kotlin checks the actual String,

    And finds that the word Cool has two os and the word Guy has one u.

    Now, if you see the output,

    The matched letter is :: o
    The matched letter is :: o
    The matched letter is :: u


    A Sequence of matches called match is created,

    val match = x.findAll(str)


    The Sequence has three items in it. i.e. o, o and u.
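
    Since the question was "how many times", the matches can also be counted instead of printed one by one. Below is a small sketch. findAll() gives back a Sequence of MatchResult, which is turned into a List of Strings here. The helper name allMatches is only an assumption made for this illustration.

```kotlin
// Hypothetical helper (not from the tutorial): every match of `pattern` in `text`, as Strings.
fun allMatches(pattern: String, text: String): List<String> =
    Regex(pattern).findAll(text).map { it.value }.toList()

fun main() {
    val found = allMatches("u|o", "Cool Guy")

    println(found)                       // [o, o, u]
    println(found.size)                  // 3
    println(found.count { it == "o" })   // 2
    println(found.count { it == "u" })   // 1
}
```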

  8. () Symbol



    The () symbol is used to group parts of a pattern.

    Say for example, let us take a different String,

    "Cool Owl"


    Now, let us say, you want to check, how many times the letter w or o is followed by the letter l.

    In such case () Symbol is the option.

    Let us see in the below example,

    Example :



    fun main(args: Array<String>) {
    
    	var str = "Cool Owl"
    	var x = Regex("(w|o)l")
    
    	val match = x.findAll(str)
    
    	match.forEach { word -> println("The matched word is :: "+word.value) }
    }
    


    Output :



      The matched word is :: ol
      The matched word is :: wl


    So, in the above example, we have checked, how many times the letter w or o is followed by the letter l.

    var x = Regex("(w|o)l")


    And we have placed the letters w and o inside the () Symbol. Also if you note, we have used the | (i.e. or) Symbol.

    Now, Kotlin checks the actual String, and tries to find a match where the letter w or o is followed by the letter l.

    (w|o)l


    And finds that there are two words, Cool and Owl.

    Where o is followed by l in Cool. And w is followed by l in Owl.

    Now, if you see the output,

    The matched word is :: ol
    The matched word is :: wl
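
    To see why the () is needed, below is a small sketch that runs the pattern with and without it. It also reads the part matched inside the () using groupValues[1] of MatchResult. The helper names allMatches and firstGroups are only assumptions made for this illustration.

```kotlin
// Hypothetical helpers (not from the tutorial).
// Every match of `pattern` in `text`, as Strings.
fun allMatches(pattern: String, text: String): List<String> =
    Regex(pattern).findAll(text).map { it.value }.toList()

// For every match, only the part matched by the first () group.
fun firstGroups(pattern: String, text: String): List<String> =
    Regex(pattern).findAll(text).map { it.groupValues[1] }.toList()

fun main() {
    val str = "Cool Owl"

    // With (): w or o, followed by l.
    println(allMatches("(w|o)l", str))   // [ol, wl]

    // The letter that was picked inside the () for each match.
    println(firstGroups("(w|o)l", str))  // [o, w]

    // Without (): either w alone, or the two letters ol.
    println(allMatches("w|ol", str))     // [ol, w]
}
```

    So, without the (), the | splits the whole pattern into w and ol, and the l is no longer required after w.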

  9. {} Symbol



    The {} is used to match the exact number of occurrences of a letter in a String.

    Say for example, let us take the String,

    "Cool Guy"


    Now, let us say, you want to check, if the letter C is followed by two os.

    In such case {} Symbol is the option.

    Let us see in the below example,

    Example :



    fun main(args: Array<String>) {
    
    	var str = "Cool Guy"
    	var x = Regex("Co{2}")
    
    	val match = x.findAll(str)
    
    	match.forEach { word -> println("The matched word is :: "+word.value) }
    }
    


    Output :



      The matched word is :: Coo


    So, in the above example, we have checked, if the letter C is followed by two os.

    var x = Regex("Co{2}")


    And we have placed the number 2 inside the {} Symbol.

    Now, Kotlin checks the actual String, and tries to find a match where the letter C is followed by two os.

    Co{2}


    And finds that there is one match, Coo.

    Now, if you see the output,

    The matched word is :: Coo


    There is just one match i.e. Coo.
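
    The tutorial only uses the form {2}. The {} symbol also has the forms {n,} (n or more) and {n,m} (between n and m), which are part of the regular expression syntax Kotlin uses on the JVM. Below is a small sketch on a String with four os. The String Cooool Guy and the helper name firstMatch are only assumptions made for this illustration.

```kotlin
// Hypothetical helper (not from the tutorial): first match of `pattern` in `text`, or null.
fun firstMatch(pattern: String, text: String): String? =
    Regex(pattern).find(text)?.value

fun main() {
    val str = "Cooool Guy"   // C followed by four os

    println(firstMatch("Co{2}", str))    // Coo    (exactly two os are taken)
    println(firstMatch("Co{2,}", str))   // Coooo  (two or more os)
    println(firstMatch("Co{1,3}", str))  // Cooo   (one to three os)

    // Exactly two os and then an l : not present, as there are four os before the l.
    println(firstMatch("Co{2}l", str))   // null
}
```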

  10. [] Symbol



    The [] is used to match a set of characters in a String.

    Say for example, let us take the String,

    "Cool Guy"


    Now, let us say, you want to check, if the letters o, G, z and y are present in the String or not.

    In such case [] Symbol is the option.

    Let us see in the below example,

    Example :



    fun main(args: Array<String>) {
    
    	var str = "Cool Guy"
    	var x = Regex("[oGzy]")
    
    	val match = x.findAll(str)
    
    	match.forEach { word -> println("The matched word is :: "+word.value) }
    }
    


    Output :



      The matched word is :: o
      The matched word is :: o
      The matched word is :: G
      The matched word is :: y


    So, in the above example, we have checked, if the letters o, G, z and y are present in the String or not.

    var x = Regex("[oGzy]")


    And we have placed all the letters o, G, z and y inside the [] Symbol.

    Now, Kotlin checks the actual String, and tries to find a match for any of the letters o, G, z and y inside the [],

    [oGzy]


    And finds that there are four matches.

    Now, if you see the output,

    The matched word is :: o
    The matched word is :: o
    The matched word is :: G
    The matched word is :: y


    You can find that there are four matches. i.e. Two matches for o, and one match each for G and y. But no match for z as z is not present in the String.

    Use of - with []



    We can use - inside the [] symbol to give a range of characters. It is a short form of 'to'.

    The below pattern:

    [a-f]


    Says, search for all the characters from a to f(i.e. a, b, c, d, e and f).

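    Similarly, the pattern :

    [0-9]


    Says, search for any one digit from 0 to 9.

    Note : A range inside [] always stands for a single character. So a pattern like [0-100] does not mean the numbers from 0 to 100. It only matches the characters 0 and 1.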

    Also, the pattern :

    [0-9][0-9]


    Says, search for two digits in a row (i.e. from 00 to 99).
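
    Below is a small sketch that tries the range patterns [a-f] and [0-9][0-9]. The second String and the helper name allMatches are only assumptions made for this illustration.

```kotlin
// Hypothetical helper (not from the tutorial): every match of `pattern` in `text`, as Strings.
fun allMatches(pattern: String, text: String): List<String> =
    Regex(pattern).findAll(text).map { it.value }.toList()

fun main() {
    // Every single lower case letter from a to f.
    println(allMatches("[a-f]", "Beautiful Nature"))
    // [e, a, f, a, e]

    // Every pair of digits. The single digit 7 is skipped, and 2024 gives two pairs.
    println(allMatches("[0-9][0-9]", "Room 7, floor 12, year 2024"))
    // [12, 20, 24]
}
```

    Note that B is not matched by [a-f], as the range only has the lower case letters.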

    Use of ^ with []



    We can also use ^ as the first character inside the [] symbol. There it is a short form of 'not'.

    The below pattern:

    [^oGzy]


    Says, search for all the characters except the letters o, G, z and y.

    Example :



    fun main(args: Array<String>) {
    
    	var str = "Cool Guy"
    	var x = Regex("[^oGzy]")
    
    	val match = x.findAll(str)
    
    	match.forEach { word -> println("The matched word is :: "+word.value) }
    }
    


    Output :



      The matched word is :: C
      The matched word is :: l
      The matched word is ::
      The matched word is :: u


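    So, if you see the above output, every character of the String Cool Guy is matched except the letters o, G and y (z is not present in the String anyway). The third line of the output looks empty because the matched character is the space between Cool and Guy.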
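
    The ^ symbol now has two meanings, depending on where it is placed. Below is a small sketch that shows both of them on the same String. The helper name allMatches is only an assumption made for this illustration.

```kotlin
// Hypothetical helper (not from the tutorial): every match of `pattern` in `text`, as Strings.
fun allMatches(pattern: String, text: String): List<String> =
    Regex(pattern).findAll(text).map { it.value }.toList()

fun main() {
    val str = "Cool Guy"

    // ^ inside [] : any character that is NOT o, G, z or y.
    println(allMatches("[^oGzy]", str).joinToString(""))  // Cl u

    // ^ outside [] : the match has to be at the start of the String.
    println(allMatches("^C", str))      // [C]

    // Both together : a first character that is not C. There is none here.
    println(allMatches("^[^C]", str))   // []
}
```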

    We will learn more examples with \ in the next tutorial, special sequences.