([A-Z])\w+

A regular expression is a strange-looking sequence of symbols and characters that describes a string or pattern within a longer piece of text. Each symbol in the regular expression represents an attribute that the target string must have. Regular expressions look funny, almost like a coded curse word, but they are extremely useful for extracting information from text such as code, log files, spreadsheets and documents.

What are they?

Below are the symbols that can be combined in sequence within a regular expression to isolate strings by their attributes:

RegEx Symbol  Represents                    RegEx Symbol  Represents
abc...        letters                       123...        digits
\d            any digit                     \D            any non-digit
.             any character                 \.            period
[abc]         only a, b or c                [^abc]        not a, b or c
[a-z]         characters a-z                [0-9]         numbers 0-9
\w            any alphanumeric character    \W            any non-alphanumeric character
{m}           m repetitions                 {m,n}         m to n repetitions
*             zero or more repetitions      +             one or more repetitions
?             optional character            \s            any whitespace
\S            any non-whitespace character  ^...$         starts and ends
(...)         capture group                 (a(bc))       capture sub-group
(.*)          capture all                   (abc|def)     matches abc or def
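
To see a few of these symbols at work, here is a minimal sketch in JavaScript. The sample sentence and the variable names are invented for illustration, and the patterns are only one way of combining the symbols.

```javascript
// Made-up sample text, purely for illustration.
const sample = "Order 66 shipped to Mr. Smith on 2024-01-15.";

// \d is any digit and + means one or more repetitions,
// so this finds every run of digits.
const numbers = sample.match(/\d+/g);

// {m} means exactly m repetitions: four digits, two digits, two digits.
const date = sample.match(/\d{4}-\d{2}-\d{2}/);

// [A-Z] is one capital letter; \w+ is the run of word characters after it.
const capitalised = sample.match(/[A-Z]\w+/g);

// (abc|def) matches either alternative; \. is a literal period.
const hasTitle = /(Mr|Ms)\./.test(sample);

console.log(numbers);      // [ '66', '2024', '01', '15' ]
console.log(date[0]);      // 2024-01-15
console.log(capitalised);  // [ 'Order', 'Mr', 'Smith' ]
console.log(hasTitle);     // true
```

Note that the digit pattern also picks up the pieces of the date, which is a good reminder that a pattern matches anything fitting its description, not just the thing you had in mind.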

These symbols are combined into an expression that describes the qualities of the string you want to isolate. In languages such as JavaScript, a regular expression starts and ends with a /, with the symbols written in between. For instance, if we wanted to isolate and return URLs, we could write /(http:.+)/g. The parentheses capture a group that contains “http:” followed by one or more repetitions of any character, so each match runs from “http:” to the end of its line. The g flag makes the search global, so every match in the text is returned rather than only the first.
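
Here is a rough sketch of that URL pattern in JavaScript. The sample text and its example.com addresses are placeholders, and the second, tighter pattern is my own variation rather than anything from the table above.

```javascript
// Hypothetical sample text; the URLs are placeholders.
const text = [
  "Docs live at http://example.com/docs for now.",
  "Mirror: http://example.org/mirror",
].join("\n");

// The pattern from the text: "http:" followed by one or more of any character.
// Because . does not match a newline, each match runs to the end of its line.
const greedy = text.match(/(http:.+)/g);

// A tighter variation (an assumption, not from the original):
// \S+ is one or more non-whitespace characters, so the match stops at the first space.
const tighter = text.match(/http:\S+/g);

console.log(greedy);   // [ 'http://example.com/docs for now.', 'http://example.org/mirror' ]
console.log(tighter);  // [ 'http://example.com/docs', 'http://example.org/mirror' ]
```

With the g flag, match returns the full text of every match. Without it, you get only the first match along with its capture groups.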

If you want to practise writing regular expressions, there is a great tutorial that The Odin Project recommends at RegExOne and I have also found a helpful regular expressions tester at RegExr if you want to have a fiddle.